I've run the same suite using Jesus' BerkeleyDBStorage, restating
all the results for easy comparison (I hadn't realized I'd left a
bunch of copy-and-paste cruft lying around in the earlier message).
1. Create a new storage
-----------------------
500,000 "NewsItem" instances -
Seconds
FileStorage2 341 (341MB RAM consumed at max)
ShelfStorage 535 (375MB)
PgStorage 867 Python: approx 105MB, pg: 30 - 45MB
BerkeleyDBStorage
951 (346 RAM consumed at max)
file space consumed ~ 250MB - File storages
consume approx 98 - 105MB.
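For context, the creation step amounts to the loop sketched below.
This is a minimal sketch, assuming Durus's FileStorage/Connection
API; the NewsItem class, the string keys, and the commit interval
are stand-ins, not the actual suite.

    # Sketch of the creation benchmark.  The NewsItem class, key
    # scheme, and batch size are assumptions.
    import time
    from durus.connection import Connection
    from durus.file_storage import FileStorage
    from durus.persistent import Persistent

    class NewsItem(Persistent):
        """Stand-in for the real NewsItem class."""
        def __init__(self, n):
            self.title = "item %d" % n
            self.body = "x" * 100

    def create(filename, count=500000, batch=10000):
        connection = Connection(FileStorage(filename))
        root = connection.get_root()
        start = time.time()
        for n in range(count):
            root[str(n)] = NewsItem(n)
            if n % batch == batch - 1:
                connection.commit()  # periodic commit bounds RAM use
        connection.commit()
        return time.time() - start

    if __name__ == "__main__":
        print("created in %.1fs" % create("news.durus"))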
2. Time to Pack
---------------
(following the initial commit of 500,000 new object instances; does
not include start-up time)
                    Seconds   RAM consumed during pack
FileStorage2        52        214MB
ShelfStorage        247       74MB
PgStorage           163       29MB (PostgreSQL server)
                              44MB (Python process)
BerkeleyDBStorage   0.014     negligible (no deleted items)
                    59        30MB (6,655 garbage objects)

Note that BerkeleyDBStorage tracks objects for garbage collection
during normal operation; pack() has nothing to do unless there is
garbage to clean up. It does not examine every record in the storage
as all the other storages do.
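The pack timing itself is just a stopwatch around a single pack()
call; a minimal sketch, again assuming the Durus Connection API and
the hypothetical news.durus file from the sketch above.

    # Sketch of the pack timing (assumes Connection.pack()).
    import time
    from durus.connection import Connection
    from durus.file_storage import FileStorage

    connection = Connection(FileStorage("news.durus"))
    connection.get_root()       # make sure the storage is open
    start = time.time()
    connection.pack()           # reclaims unreachable records
    print("packed in %.3fs" % (time.time() - start))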
3. Start Up Times
-----------------
(time to get "root")
                    Seconds   RAM consumed
FileStorage2
  Before pack       12.316    75MB
  After pack        3.923     104MB
ShelfStorage
  Before pack       18.696    75MB
  After pack        0.001     14MB
PgStorage           0.084     Python: 15MB; Postgres: 18MB
BerkeleyDBStorage
  Before pack       0.49*     26MB
  After pack        0.081     20MB observed

* Not so sure about this number - it may have been my system.
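Start-up time here is measured from opening the storage to getting
the root object; a minimal sketch under the same Durus assumptions.

    # Sketch of the start-up measurement.
    import time
    from durus.connection import Connection
    from durus.file_storage import FileStorage

    start = time.time()
    connection = Connection(FileStorage("news.durus"))
    root = connection.get_root()
    print("got root in %.3fs" % (time.time() - start))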
4. Time to Access Objects
-------------------------
Best of three runs; a random object from within the 500,000 news
items is returned.
(All times after pack, in seconds)

                    Constant  |-------- Random --------|
                    1         10        100       1000
FileStorage2        0.006     0.128     0.294     1.970
ShelfStorage        0.010     0.063     0.317     2.113
PgStorage           0.008     0.099     0.576     3.372
BerkeleyDBStorage   0.008     0.042     0.343     2.229
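For reference, a sketch of the access benchmark: the "Constant"
column reads the same object once, while the "Random" columns read
10, 100, or 1000 randomly chosen items, best of three runs.  The
key scheme is the one assumed in the creation sketch, and
connection-cache effects between runs are ignored here.

    # Sketch of the access benchmark (assumes string keys "0".."499999").
    import random
    import time
    from durus.connection import Connection
    from durus.file_storage import FileStorage

    def timed_reads(root, keys):
        start = time.time()
        for key in keys:
            root[key].title     # touching an attribute forces a load
        return time.time() - start

    connection = Connection(FileStorage("news.durus"))
    root = connection.get_root()

    print("constant: %.3fs" % timed_reads(root, ["12345"]))
    for n in (10, 100, 1000):
        keys = [str(random.randrange(500000)) for _ in range(n)]
        best = min(timed_reads(root, keys) for _ in range(3))
        print("%4d random: %.3fs" % (n, best))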