durusmail: quixote-users: Re: Deferrer: helper class for spawning single use threads
Re: Deferrer: helper class for spawning single use threads
Titus Brown
2004-03-05
[ apologies if this seems OT; just yell at me and we'll move ;) ]

On Thu, Mar 04, 2004 at 01:52:01PM -0500, Graham Fawcett wrote:
-> Titus Brown wrote:

[ munch ]

-> >Overall, the system works quite well, with only one exception: most of
-> >the jobs involve calling out to an external binary program (I need to run
-> >several closed source binaries, groan), and if that program uses up a lot
-> >of memory or otherwise dies badly, the job can die w/o any record.  There
-> >are ways to control for that, but they all involve adding complexity;
-> >controlling and monitoring the remote processes seems to be one of the
-> >places where Linda tuple spaces restrict your options.
->
-> As I feared. :-(   I also will need calls out to remote binaries.

Mine sometimes involve jobs that run for ~20-30 hours (sequence database
searches) and that can crash.  I've opted to "force" manual intervention
in those cases because they're relatively rare and that way I get some idea
of how frequently they occur.  Obviously not a long-term solution...

I do retain jobs in the database (with a 'taken' flag to indicate
that they're no longer available) so it's a simple matter to reset that.
It's merely annoying to people waiting for the job to finish ;).
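
For illustration only, a minimal sketch of the 'taken' flag idea. The
table layout, the column names and the helper functions here are all
assumptions (the real schema isn't shown in this thread), and sqlite3
stands in for whatever database is actually used:

```python
import sqlite3

def make_queue():
    # In-memory database for illustration; the schema is hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, command TEXT, taken INTEGER DEFAULT 0)"
    )
    return conn

def add_job(conn, command):
    cur = conn.execute("INSERT INTO jobs (command) VALUES (?)", (command,))
    conn.commit()
    return cur.lastrowid

def take_job(conn):
    """Claim the oldest untaken job; return (id, command) or None."""
    row = conn.execute(
        "SELECT id, command FROM jobs WHERE taken = 0 ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # The "AND taken = 0" guard keeps two workers from claiming one row.
    cur = conn.execute(
        "UPDATE jobs SET taken = 1 WHERE id = ? AND taken = 0", (row[0],)
    )
    conn.commit()
    return row if cur.rowcount == 1 else None

def reset_job(conn, job_id):
    """Manual intervention: make a crashed job available again."""
    conn.execute("UPDATE jobs SET taken = 0 WHERE id = ?", (job_id,))
    conn.commit()
```

Because the row is kept rather than deleted when a job is taken,
recovering from a crash is a single UPDATE.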

[ munch of leases ]

-> I've considered having the agent process, which communicates with the
-> tuple space, spawn a second process to do the actual work. If anything
-> is going to die, my reasoning goes, it will be the heavy-lifting
-> process. If that process doesn't return an OK to the communications
-> process within an expected time, then the comms process could push the
-> job's tuple back into the space, and perhaps kill itself. Essentially
-> it's a "rollback on timeout" in a long-running transaction. I don't have
-> a whole lot of heavily synchronized stuff going on, so this is about as
-> transactional as I would need to get.

I have toyed with this idea, and will probably do this.  It would more than
quadruple the amount of job-generic code running on the client, though,
and I've held off because of that.  (OK, OK, quadrupling 10 lines isn't
so serious...)
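
A rough sketch of what that "rollback on timeout" wrapper might look
like in present-day Python.  The function name and the put_back
callback are hypothetical stand-ins for "push the job's tuple back into
the space"; nothing here is the code from either system:

```python
import subprocess
import sys

def run_with_rollback(job, command, timeout_s, put_back):
    """Run command in a child process; on any failure call put_back(job).

    Returns True if the child exited with status 0, False otherwise.
    """
    try:
        # subprocess.run kills the child itself before raising
        # TimeoutExpired, so no stray heavy-lifting process is left.
        result = subprocess.run(command, timeout=timeout_s)
    except (subprocess.TimeoutExpired, OSError):
        put_back(job)
        return False
    if result.returncode != 0:
        # Non-zero exit, or a negative code if a signal killed the child.
        put_back(job)
        return False
    return True

if __name__ == "__main__":
    returned = []
    ok = run_with_rollback(
        "job-1", [sys.executable, "-c", "pass"], 30, returned.append
    )
    print(ok, returned)
```

The communications side stays tiny and never does the heavy lifting
itself, so it should survive to report the failure even when the
external binary runs out of memory.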

-> I'm a bit leery of my ideas, and that's why I'm sharing them with you;
-> feedback is most welcome! The whole tuple space idea is clean and
-> elegant, and my ideas are messy and roughshod. I feel like a Visigoth
-> draping bearskins in the temple of Pallas, to make it feel more like home...

I just try to be an effective Visigoth these days ;).

-> >I've thought about using Pyro to do some inter-node communication, but
-> >I haven't had great luck with Pyro and am unwilling to add it in.  You
-> >might consider taking a look at it, though; I suspect I just haven't put
-> >in the time to understand it properly.
->
-> I really would like to leave the door open for agents to be written in
-> any language (though certainly in Python for the near future), so Pyro
-> might be a step in the wrong direction for me. And I'm really taken with
-> the simplicity factor of tuple spaces. If I can write or find a more
-> efficient implementation down the road, it should be relatively easy to
-> integrate into my other work, since the tuple-space semantics (and API)
-> should be almost unchanged.

There might be room for a Web services-style implementation; your project
sounds interesting!  OTOH tuple spaces are *so* simple that unless you
really solve a hard problem... well, you get the idea.
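
To give an idea of how little is involved, here is a toy in-memory
tuple space.  The method names follow the usual Linda conventions (out
to write, rdp to read without removing, inp to take, the last two
non-blocking); using None as a wildcard is an assumption of this
sketch, and it is not the implementation of either system discussed
here:

```python
import threading

class TupleSpace:
    """Toy tuple space: a locked list plus pattern matching."""

    def __init__(self):
        self._tuples = []
        self._lock = threading.Lock()

    @staticmethod
    def _matches(pattern, tup):
        # None in the pattern matches any value in that position.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def out(self, tup):
        """Write a tuple into the space."""
        with self._lock:
            self._tuples.append(tuple(tup))

    def rdp(self, pattern):
        """Return the first matching tuple without removing it, or None."""
        with self._lock:
            for tup in self._tuples:
                if self._matches(pattern, tup):
                    return tup
        return None

    def inp(self, pattern):
        """Remove and return the first matching tuple, or None."""
        with self._lock:
            for i, tup in enumerate(self._tuples):
                if self._matches(pattern, tup):
                    return self._tuples.pop(i)
        return None
```

A worker would call something like space.inp(("job", None, None)) to
claim work and space.out(...) to post a result; pushing a job back
after a failure is just another out.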

If you have any references you have found useful, I'd be interested in
them.  I didn't even realize JavaSpaces existed; heck, I built the system
before I had ever heard of a tuple space...

cheers,
--titus

