Hi David,

Thanks for having a look, and sorry for the slow response.

Underlying a lot of my ongoing experimentation is growing databases, many GBs so far and continuing to grow. Getting foolproof, fast replication running against big databases is a priority. I currently use full rsyncs but have been thinking of ways to be more efficient. One of the hats I wear in my day job is sysadmin; we use rsync heavily on busy machines, and I've grown wary of the storm of I/O it can generate over lots of data.

Getting rsync to avoid reading all of the source and destination requires the append option that has been discussed here before, which leads to the issue of ensuring that the destination file is a strict prefix of the source one. This is where my experimentation started: how can we know that the destination is that strict prefix? (There's a rough sketch of the kind of check I mean at the end of this message.) My first thought was to just give each file a unique header and compare those, but then I realised that to append anything to the slave, the data has to come from the matching point in the master, which is hard to know after any interruption of the appending, e.g. if the slave server is getting an update and is offline for 10 minutes, from where in the master does the data get read?

As I see it, there are two ways around this: run a full rsync for each replication pass, so that the slave state can be anything at all and will be cleaned up; or keep track of some structure in both the slave and the master, compare where the slave is up to against the master, and therefore be able to append cleanly. I was very pleasantly surprised at how little change to the shelf code was needed to get enough structure to use.

I hope that clears up where I'm coming from. If nothing else, I continue to learn a lot about Durus, which is delightfully small for what it does; small enough for the concepts and code to fit in my head ;-)

Peter W.

On 26/06/2009, at 11:00 PM, Binger David wrote:

> It seems like your replication strategy works unnecessarily hard to
> track transaction boundaries. If the master fails and the slave has a
> partial transaction, then the new server process would need to
> truncate the partial transaction at startup, just as it would if the
> same condition happened without replication involved.
>
> The careful rsync strategy that I think I've posted here earlier can
> easily run every minute, and it recognizes when packs happen. If you
> need more frequent updates, I think you can use the same
> inode-checking strategy along with a remote "tail -f" to get the job
> done. Is that not right?
>
> I think what you've done is cool, I'm just not sure if it is cool
> enough to change the file format. Am I overlooking something?
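
P.S. Since the "strict prefix" test is the crux, here is a minimal sketch of the kind of check I mean, assuming for simplicity that both files are readable locally (in the real setup the master-side hash would be computed on the master host so the check itself ships no data). The names are mine for illustration; none of this is in Durus:

    import hashlib
    import os

    def hash_prefix(path, length, chunk_size=1 << 20):
        """SHA-1 digest of the first `length` bytes of the file at `path`,
        or None if the file is shorter than `length`."""
        digest = hashlib.sha1()
        remaining = length
        f = open(path, 'rb')
        try:
            while remaining > 0:
                chunk = f.read(min(chunk_size, remaining))
                if not chunk:
                    return None  # file shorter than the requested prefix
                digest.update(chunk)
                remaining -= len(chunk)
        finally:
            f.close()
        return digest.hexdigest()

    def safe_to_append(master_path, slave_path):
        """True if the slave file is a strict prefix of the master, i.e.
        the slave can be caught up by appending master[len(slave):]."""
        slave_len = os.path.getsize(slave_path)
        if slave_len > os.path.getsize(master_path):
            return False  # slave longer than master: full resync needed
        return (hash_prefix(slave_path, slave_len) ==
                hash_prefix(master_path, slave_len))

If safe_to_append() holds, an append-only copy (rsync --append, or a remote tail) brings the slave up to date without re-reading the bytes already transferred; if it doesn't, fall back to a full rsync.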