
> I've written programs where threads make the program easier to reason about rather than harder.

I want to hear about this.



The one that comes to mind in particular was something that sent email. We wanted to limit emails being sent per email server to each ISP.

We went with a queue per ISP for outgoing emails, and each server that could send email to that ISP had its own thread. Each thread would take X emails out of the queue every 10 seconds, send them, then sleep for whatever was left of the 10 seconds, if anything.

Most of the time was spent sleeping or on I/O - the point of the threads wasn't CPU usage, it was that they made everything so simple to implement. The only synchronization issues were taking emails out of the queue and putting them back somewhere if they failed to send.

All of the logic for each thread in its main loop read like a normal, sequential program. I guess you could call this an embarrassingly easy problem, but part of why it's so easy is it maps so easily to threads.
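
For illustration, here is a minimal Java sketch of what one such per-server thread could look like. It is not the original code: the class name `IspSender`, the `send` predicate, and the choice to put failed emails back on the same queue are all assumptions made for the example (the comment only says failures went back "somewhere").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Predicate;

// Hypothetical worker: one instance per (sending server, ISP) pair.
class IspSender implements Runnable {
    private final BlockingQueue<String> queue; // shared per-ISP queue of emails
    private final Predicate<String> send;      // assumed sender; returns true on success
    private final int batchSize;               // the "X" from the description
    private final long windowMillis;           // 10 seconds in the description

    IspSender(BlockingQueue<String> queue, Predicate<String> send,
              int batchSize, long windowMillis) {
        this.queue = queue;
        this.send = send;
        this.batchSize = batchSize;
        this.windowMillis = windowMillis;
    }

    // Takes up to batchSize emails, sends them, and requeues failures.
    // Returns how many were sent successfully.
    int sendOneBatch() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch, batchSize); // thread-safe removal from the shared queue
        int sent = 0;
        for (String email : batch) {
            if (send.test(email)) {
                sent++;
            } else {
                queue.add(email); // assumption: failures go back on the same queue
            }
        }
        return sent;
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                long start = System.nanoTime();
                sendOneBatch();
                long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
                long leftover = windowMillis - elapsedMillis;
                if (leftover > 0) {
                    Thread.sleep(leftover); // sleep out the rest of the window, if any
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // exit the loop and preserve the flag
        }
    }
}
```

The loop in `run()` reads top to bottom like a single-threaded program; the only shared state is the queue, and `BlockingQueue` handles that synchronization.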


Okay, here's one. =)

At work we recently wrote something for handling automated translations of content via our external review providers. Occasionally some stuff-to-translate gets pulled from a database, we figure out what it needs to be translated to, and job orders get dropped into queues for each (provider, from-lang, to-lang) tuple. Once the queue is processed, the translated content gets pushed back into the database.

One fairly major catch: each of our translation providers supports a different number of concurrent translation requests - we can do two simultaneous English->Spanish translations, say, but four English->French ones. So I wrote it such that each (provider, from-lang, to-lang) queue is serviced by its own threadpool with an upper bound on its active threads equal to the maximum that the external translation provider can concurrently handle, so we get in-flight management for free. (This could have ugly consequences in overloaded cases with tons of providers/tons of in-flight requests, but the machine it runs on is dedicated to this process and, as we tend to do, the JVM it runs on is provisioned with approximately eight hojillion bytes of RAM.)
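
A rough sketch of that idea in Java, using one fixed-size pool per tuple. This is only an illustration under assumptions, not the code described above: `TranslationPools`, `Route`, `register`, and `submit` are made-up names, and the record syntax needs Java 16 or later.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical registry: one bounded thread pool per (provider, from, to) tuple.
class TranslationPools {
    // Hypothetical key type; records give value-based equals/hashCode for free.
    record Route(String provider, String fromLang, String toLang) {}

    private final Map<Route, ExecutorService> pools = new ConcurrentHashMap<>();

    // maxConcurrent is the provider's limit on simultaneous requests for this route.
    void register(Route route, int maxConcurrent) {
        pools.put(route, Executors.newFixedThreadPool(maxConcurrent));
    }

    // Jobs beyond the limit wait in the pool's internal queue, so at most
    // maxConcurrent requests for a route are in flight at once.
    Future<String> submit(Route route, Callable<String> job) {
        ExecutorService pool = pools.get(route);
        if (pool == null) {
            throw new IllegalArgumentException("unknown route: " + route);
        }
        return pool.submit(job);
    }

    void shutdown() {
        pools.values().forEach(ExecutorService::shutdown);
    }
}
```

Because `Executors.newFixedThreadPool(n)` never runs more than `n` tasks at a time, the in-flight cap falls out of the pool size, and each job can be written as plain blocking code.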

There are certainly other ways to accomplish this task, but threading and threadpools made it conceptually a lot simpler to reason through. (Though evented was the first thing I thought of, we couldn't really do something using it--both because Java's support for it is poor and because of the nature of some of our translation providers. For one at least, we have to repeatedly hit an HTML page and scrape it to find out when our translation job is done!) We could functionally treat each module as a discrete case - the simple queued interface let the people writing the translation handlers treat it as if it was a single-threaded application. That's actually exactly how we wrote it, too: I told the other person on the project to just drop the class they were writing into main() and make sure it worked, while I built the infrastructure in which it'd actually run as a daemon. He didn't have to care about the threading, while it gave us resource control and ease of expansion.



