durusmail: quixote-users: Re: Deferrer: helper class for spawning single use threads
Re: Deferrer: helper class for spawning single use threads
Titus Brown
2004-03-04
-> >I have something here that some of you might find useful.  It's a (probably
-> >poorly named) module for handling tasks in a separate thread, and storing
-> >the result until the original thread comes back to get it.  It isn't a
-> >tuple space, but it was - in part anyway - inspired by the tuple space
-> >discussion on this list several months back relating to approaches to
-> >handling long-running requests.
->
-> I wanted to say thanks for this, Jason. I still haven't had the time to
-> take a close look at the code, but this is a good building-block to have
-> at hand. When I get a bit of free time, I'll play the friendly critic
-> and give your code a test-drive.
->
-> My CMS project has a need for a deferred-request system, but I want it to
-> scale and distribute across multiple processes/servers, so I'm going to
-> try the tuple-space approach. It makes development a bit tougher -- it's
-> hard (for me) to re-imagine a 'monolithic' program in terms of multiple
-> processes running on multiple machines -- but I have hopes that it will
-> work for me in the long run.

Hi, Graham,

In Cartwheel, a bioinformatics framework that's been running for several
years, I use a PostgreSQL database as the hub of a system that distributes
jobs across a Beowulf cluster.  The underlying mechanism used is a tuple
space.

Jobs are submitted via several mechanisms: a Web site, Web services, or
direct database access.  The jobs then get picked up by nodes polling
for available jobs.  I use PostgreSQL to deal with transactions and database
locking: certainly overkill, but quite effective nonetheless.
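
To make the polling pattern above concrete, here is a minimal sketch of a
job table used as a tuple space.  It is not Cartwheel's code: it uses
Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it
runs anywhere, and the table layout and the names open_space, submit and
claim are all invented for illustration.  The point is only that the
"take one pending job" step happens inside a single transaction, so two
polling nodes cannot pick up the same job.

```python
import sqlite3


def open_space(path=":memory:"):
    """Open the (hypothetical) job table that acts as the tuple space."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN/COMMIT statements below control the transactions.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " worker TEXT)"
    )
    return conn


def submit(conn, payload):
    """Put a job into the space; returns the new job id."""
    cur = conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    return cur.lastrowid


def claim(conn, worker):
    """Take the oldest pending job, or return None if there is none.

    BEGIN IMMEDIATE takes SQLite's write lock up front, so the SELECT and
    the UPDATE cannot interleave with another node's claim.
    """
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET status = 'running', worker = ? WHERE id = ?",
                (worker, row[0]),
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row
```

A node would call claim() in a loop, sleeping between attempts whenever it
returns None.  With PostgreSQL, one option for the same effect is to lock
the chosen row with SELECT ... FOR UPDATE inside the transaction.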

Overall, the system works quite well, with only one exception: most of
the jobs involve calling out to an external binary program (I need to run
several closed source binaries, groan), and if that program uses up a lot
of memory or otherwise dies badly, the job can die without any record.  There
are ways to control for that, but they all involve adding complexity;
controlling and monitoring the remote processes seems to be one of the
places where Linda tuple spaces restrict your options.
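
As one illustration of the kind of added complexity meant here, a node
could wrap each external program so that some outcome is always recorded.
This is a sketch under assumptions, not what Cartwheel does: the function
name run_and_record and the shape of the returned record are invented, and
the result would still have to be written back to the job table by the
caller.

```python
import subprocess
import sys


def run_and_record(argv, timeout=None):
    """Run an external program and always return an outcome record."""
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return {"status": "timeout", "returncode": None, "stderr": ""}
    except OSError as exc:
        # The binary could not be started at all (missing, not executable).
        return {"status": "failed", "returncode": None, "stderr": str(exc)}
    # On POSIX a negative returncode means the child was killed by a signal.
    status = "done" if proc.returncode == 0 else "failed"
    return {
        "status": status,
        "returncode": proc.returncode,
        "stderr": proc.stderr[-2000:],  # keep only the tail for the record
    }


# Example: a child process that exits with a non-zero status.
result = run_and_record([sys.executable, "-c", "import sys; sys.exit(3)"])
print(result["status"], result["returncode"])
```

This only helps when the wrapper itself survives.  If the node process is
killed along with the binary, the job is still left marked as running,
which is why something like a timestamp on claimed jobs and a periodic
sweep for stale ones tends to be needed as well.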

I've thought about using Pyro to do some inter-node communication, but
I haven't had great luck with Pyro and am unwilling to add it in.  You
might consider taking a look at it, though; I suspect I just haven't put
in the time to understand it properly.

Hope this helps ;).  I'm very happy with the tuple space mechanism and
I think it's a nice lightweight way to distribute jobs.

cheers,
--titus

