Forking question
Evan LaForge
2004-05-19
> what is the best way to handle this type of situation?  should the
> quixote class handling the request take the information, set up the long
> process and fork a child to run the long-running process, while the
> parent returns the response?  or should we pass the information to some
> "thing" (be it database/config file/whatever) and have a separate
> process that polls for entries to that thing and then handles running
> the long-running process?

There's nothing wrong with starting a process immediately to handle the
request.
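
If you do start it immediately, the classic way is to fork twice, so the
child running the job is detached and the parent never has to wait on it.
A minimal sketch of that option (run_long_job is a hypothetical stand-in
for the actual hour-long work):

import os

def start_in_background(run_long_job):
    # Fork twice so the grandchild is reparented to init and the
    # parent never has to wait for the hour-long job (no zombies).
    pid = os.fork()
    if pid == 0:
        os.setsid()
        if os.fork() == 0:
            try:
                run_long_job()      # the hour-long work
            finally:
                os._exit(0)
        os._exit(0)                 # first child exits immediately
    os.waitpid(pid, 0)              # reap the first child; returns fast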

An advantage of a separate long-running worker process is that all
processing is serialized, so if your one-hour job is CPU- or disk-intensive,
you may not want 12 of them all running at once.  It could be as simple as:

quixote:

import os
import time

def process_request(...):
    # use user.name or something to identify the requester
    fn = '%s,%s' % (time.time(), request.user.name)
    f = open(os.path.join(request_dir, fn), 'w')
    f.write(req_info)
    f.close()
    return 'Come back in an hour or so!'

def check_request [html] (...):
    pending = [ fn for fn in os.listdir(request_dir)
        if fn.endswith(',' + request.user.name) ]
    # a request file is only deleted once its job finishes, so anything
    # still in request_dir is in progress
    done = [ fn for fn in os.listdir(finished_dir)
        if fn.endswith(',' + request.user.name)
        if fn not in pending ]
    if done:
        'Your requests are done: %s' % ('\n'.join([
            format_result(open(os.path.join(finished_dir, fn)).read())
            for fn in done ]))
    else:
        'No requests are done.'
    if pending:
        'Pending requests: ...'


long-running process:

#!/bin/sh
cd "$request_dir" || exit 1
while true; do
    req=`ls | sort | head -1`   # oldest timestamp first, so requests are FIFO
    if [ -n "$req" ]; then
        giant_process <"$req" >"$finished_dir/$req" && rm "$req"
    else
        sleep 5
    fi
done

This also has the advantage that if your server crashes you won't lose any
queued or running requests, since a request file is only deleted after
giant_process completes.  If you want to run all the giant_processes in
parallel, wrap the giant_process line in parentheses (so the && rm comes
along with it) and put an & at the end.  If you want them to run in
parallel and you don't care about surviving server crashes, you could
eliminate the long-running process entirely by calling os.spawnv directly
from process_request, but now check_request needs some other indication
that a giant_process output file is complete.
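
One simple indication is to have the spawned command write its output under
a temporary name and rename it into finished_dir only on success; rename is
atomic on the same filesystem, so "file present in finished_dir" then means
"done".  A sketch of that variant (giant_process and the paths are
placeholders, assumed to be shell-safe):

import os

def spawn_giant_process(req_path, finished_path):
    # Fire-and-forget: no polling worker.  The output is renamed into
    # place only if giant_process succeeds, so check_request can treat
    # its presence in finished_dir as completion.
    tmp = finished_path + '.tmp'
    cmd = 'giant_process <%s >%s && mv %s %s' % (
        req_path, tmp, tmp, finished_path)
    os.spawnv(os.P_NOWAIT, '/bin/sh', ['sh', '-c', cmd])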

