On Dec 15, 2005, at 9:17 AM, Jesus Cea wrote:

> I'm sorry to bother you. I'm a new Durus user and I was wondering if
> anybody there is working on a BerkeleyDB storage backend. If not,
> would Durus developers accept external submissions in this area?

I don't know of any such work. I seem to remember that the ZODB once
had a BerkeleyDB storage, but I think support for it was dropped for
reasons I don't know. You might want to look at the ZODB-dev archives
for more information about what was done there, since I think it would
be very similar in the Durus context.

Personally, I'd be glad if there were such an option for Durus.
Because we would probably not be good maintainers of it, I would
probably not be eager to include it in the Durus distribution. That
could change, though, if the Durus user community seemed to think it
was important. In any case, please don't let the inclusion question
influence your work.

> I'm using Durus in our testing environment and we are concerned about
> the memory usage and start/pack time in a multimillion, dynamic
> object database.
>
> I can elaborate, if you wish.

I'd like to hear more about it.

I'm sure you know this already, but for the benefit of others I'll say
that the memory usage of the storage server grows with the number of
instances because it keeps an in-memory index that holds the offset of
each instance's record in the file. For a million instances, this
typically takes about 100MB of RAM, and it should scale linearly, so
it would need about a GB of RAM for 10 million instances. How many
instances do you want to support?

The start-up times are much better now than in earlier versions of
Durus because a pickle of the index is written into the FileStorage
when you pack the database. The startup just has to unpickle the index
instead of building it from scratch by traversing all of the records
in the file. Still, a different kind of Storage could support faster
startup and more instances by keeping all index information on disk,
and I guess that is what you are proposing.

Packing does two things: it removes obsolete instance records, and it
removes records for objects that can't be reached by references
starting at the root object. An alternative storage might eliminate
the obsolete instance records as new versions are committed, so
packing wouldn't have as much to do, and that would be good.
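
To put a rough number on the memory point, here is a back-of-the-envelope
sketch in Python. The ~100 bytes per index entry is just the
100MB-per-million figure above turned around; the real per-entry cost
depends on how the FileStorage index and the oids are represented.

    def index_ram_estimate(n_instances, bytes_per_entry=100):
        # ~100 bytes/entry is the 100MB-per-million observation above;
        # the actual cost depends on the dict and oid representation.
        return n_instances * bytes_per_entry

    for n in (10**6, 10**7):
        mb = index_ram_estimate(n) / float(2**20)
        print("%d instances -> about %d MB of index" % (n, mb))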
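
For what it's worth, here is a minimal sketch of the kind of on-disk
index I have in mind, using the bsddb btree module from the Python 2
standard library. The class and method names are made up for
illustration and only loosely resemble what a real Durus Storage would
need; the point is just that the oid -> offset map lives on disk (plus
the BerkeleyDB cache), so server memory and startup time stop growing
with the number of instances.

    import bsddb

    class OnDiskIndex(object):
        # Hypothetical helper, not part of Durus: an oid -> offset map
        # kept in a BerkeleyDB btree instead of an in-memory dict.

        def __init__(self, path):
            self.db = bsddb.btopen(path, 'c')   # dict-like btree on disk

        def get_offset(self, oid):
            return int(self.db[oid])            # KeyError for unknown oids

        def set_offset(self, oid, offset):
            self.db[oid] = str(offset)          # bsddb keys/values are strings

        def sync(self):
            self.db.sync()                      # flush at commit time

        def close(self):
            self.db.close()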
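
And on the packing point: if the storage keeps exactly one current
record per oid, say in a BerkeleyDB table, then committing a new
version overwrites the old one right away, and packing is left with
only the unreachable-object sweep. Another made-up sketch, not a real
backend:

    import bsddb

    class RecordTable(object):
        # Hypothetical: one current record per oid, so obsolete versions
        # disappear at commit time instead of waiting for a pack.

        def __init__(self, path):
            self.db = bsddb.hashopen(path, 'c')

        def store(self, oid, record):
            self.db[oid] = record   # replaces any previous version in place

        def load(self, oid):
            return self.db[oid]

        def commit(self):
            self.db.sync()          # make the batch of overwrites durable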