Here is another idea that seems really easy and effective for startup.
I can change the qp script so that "qp start" is actually the same as
"qp -du". That way, the pid files are cleaned up, if present, on a
normal startup, and I think the "qp" script will work without changes
as a startup script.

>
>     site.stop_durus()
>     while site.is_durus_running():
>         time.sleep(1)
>     site.start_durus()
>

I think a RuntimeError is better than this possibly infinite loop.

> I'm leaning towards 2. Or, if is_durus_running() doesn't get
> modified, I could just sleep for say 3 seconds between the
> stop_durus and start_durus calls to create a big enough window
> where the odds of a race condition are essentially none. Still a
> little kludgey but should be effective enough.

The next Durus release delays grabbing the lock until a write
operation is started, so there is a built-in delay in startup that
should be enough to make sure that the previous process has plenty of
time to die and release the lock. Also, stop_durus already waits for
the bound socket to become available, and it seems quite unlikely that
the socket will become available before the file lock is released.
I doubt that any delay is necessary.

>
>> I think there will always be the remote possibility of a race,
>> even if we check that locks are available. The critical thing
>> is that only one writer wins the race.
>
> True. However, it looks like start_durus doesn't retry if Durus
> can't lock the database file; Durus just exits with a RuntimeError.
> Then start_durus calls wait_for_server and after a timeout, exits
> the parent process raising SystemExit. I suppose I could catch
> SystemExit and assume what happened to the Durus process and try to
> start it again....

If wait_for_server times out, I don't think an automatic retry will
help.
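
To make the RuntimeError suggestion concrete, here is a rough sketch of the stop/start sequence quoted above with a bounded wait in place of the open-ended loop. It assumes only that the site object has the stop_durus(), is_durus_running() and start_durus() methods used in the quoted snippet. The restart_durus name and the timeout and poll_interval parameters are made up for illustration and are not part of qp.

```python
import time


def restart_durus(site, timeout=10.0, poll_interval=0.5):
    """Stop Durus, wait a bounded time for it to exit, then start it.

    `site` is assumed to provide stop_durus(), is_durus_running() and
    start_durus(), as in the quoted snippet. `timeout` and
    `poll_interval` are hypothetical parameters, not part of qp.
    """
    site.stop_durus()
    deadline = time.monotonic() + timeout
    while site.is_durus_running():
        if time.monotonic() >= deadline:
            # Give up with an error instead of looping forever.
            raise RuntimeError(
                "Durus still running %.1f seconds after stop_durus()"
                % timeout)
        time.sleep(poll_interval)
    site.start_durus()
```

If the old process never goes away, the caller gets a RuntimeError after roughly `timeout` seconds and start_durus() is never called, so there is no second attempt to grab the lock.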