On Mar 20, 2006, at 7:14 PM, Neil Schemenauer wrote:

> David Binger wrote:
>> The format I'm thinking about for the index is simple:
>>
>> 8 bytes length
>> 8 bytes maximum oid
>> 8 bytes maximum offset
>> Ordered array of (trimmed_oid, trimmed_offset) records.
>> (By trimming, I mean with left-side bytes removed
>> that are null for all values in the array.)
>
> How would this be read from disk in an efficient manner? Maybe we
> need something like CDB: http://cr.yp.to/cdb.html. There is a
> cdb.py module in the spambayes package.

If my reading of the cdb spec is right, cdb would require several times
more space for the index, both because it is designed for variable-length
keys and values (which we do not have), and because it is a hash table
with a non-perfect hash, so a third to a half of the allocated slots
must remain empty.

Whatever format we use for the on-disk index, it should take advantage
of the fact that we know the lengths of the keys and values in advance,
and we can afford (I think) to take the time to write them down in order.

My idea for the simple array format is to use a fixed-size index-index
in RAM to "hash" oids to the region of the array where the corresponding
offsets are stored. Within each zone, we can use a bisection search or
a linear search, just as we would for collisions in a hash table. The
fixed-size index-index serves the same purpose as cdb's initial index
of 256 hash tables, except that it is not limited to that predetermined
number of segments.

We also have an advantage over cdb in that we know we have a
long-running server that is going to use the index. The cdb format has
to be fast for one-shot accesses, but ours does not. With a fixed 100MB
of RAM, I think we could obtain offsets for at least hundreds of
millions of oids with pretty close to one disk read per oid.

> If we are serious about making Durus scale to huge databases then I
> think there are other issues to fix as well.
> For example, even with an on disk index for record offsets, packing
> still requires memory proportional to the number of objects in the DB.

I'm pretty sure we could modify the packing code to avoid that.

> Neil
>
> _______________________________________________
> Durus-users mailing list
> Durus-users@mems-exchange.org
> http://mail.mems-exchange.org/mailman/listinfo/durus-users