On 19/06/2008, at 1:46 AM, Binger David wrote:

> On Jun 18, 2008, at 9:34 AM, Peter Wilkinson wrote:
>
>> This is very appealing and timely. Other than somehow flagging when
>> packs have been done on the source database, it would seem to be
>> quite usable. Have you done tests to see what happens when an
>> --append is run when the content has changed earlier in the file than
>> the destination's end? Is the keeping of that inline left up to the
>> driver?
>
> I haven't tested anything except the basic operation of this.
> If, between two runs of rsync3 with --append, the database is packed
> and subsequently grows larger than it was on the last run of rsync3,
> I think the --append would produce a bad backup file.
> If you run it with --append at high frequency, the probability of
> this situation is low, but not zero.
> That's where the extra time spent by --append-verify would be
> justified.
>
> I suppose you could also run an rsync on the log and run rsync with
> --append-verify when you see new Pack log messages.

I've been having a look at this and at how to wrap it up in some scripts, and had a thought that might be doable and would make this much easier and more predictable.

At the moment the shelf storage just writes a prefix of "SHELF-1" at the start of the file. Might it make sense to put an additional bit of data after that, before the transactions start, which would identify when this database was created/packed? A UUID would be ideal, but it is a bit rough to require Python 2.5, so maybe a timestamp would be enough.

This way a script to replicate the database would just need to read the first x bytes and compare them to what it already had to know what kind of rsync to perform; no knowledge outside of what is in the source and destination databases would be needed.

Regards,
Peter Wilkinson.
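
A rough sketch of what such a replication script's check might look like, in Python. Everything past the "SHELF-1" prefix is an assumption for illustration: the 8-byte stamp length, the names read_header and choose_rsync_mode, and the idea that both files can be read locally are all hypothetical, since no such stamp exists in the file format today.

```python
import os

PREFIX = b"SHELF-1"   # prefix the shelf storage writes at the start of the file
STAMP_LEN = 8         # hypothetical: size of the proposed created/packed stamp
HEADER_LEN = len(PREFIX) + STAMP_LEN


def read_header(path):
    """Return the prefix plus stamp bytes, or None if they cannot be read."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        header = f.read(HEADER_LEN)
    # A short file or an unexpected prefix means we cannot trust the stamp.
    if len(header) < HEADER_LEN or not header.startswith(PREFIX):
        return None
    return header


def choose_rsync_mode(source, dest):
    """Return "--append" when both files carry the same stamp, else None.

    None is meant as "run a plain rsync without any append option",
    because a differing or unreadable stamp suggests the source was
    created or packed since the last copy.
    """
    src = read_header(source)
    dst = read_header(dest)
    if src is not None and src == dst:
        return "--append"
    return None
```

The sketch falls back to a plain rsync on any mismatch rather than guessing, on the assumption that a full comparison is the safe choice whenever the destination might predate a pack.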