For Google's sake primarily, here is a comparison of three storages:
FileStorage2, ShelfStorage, and PostgresStorage (listed as PgStorage
in the tables below).

Any time a large number of new objects is created and committed to a
database of any type, there is no escaping the need to think about
the creation/persistence strategy, unless you have infinite RAM
available.

Conclusion: for many use cases, ShelfStorage is a real winner!
Observation: FileStorage2 is the more efficient writer.

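One common creation/persistence strategy is to commit in fixed-size batches so that uncommitted objects never pile up in RAM. The sketch below only illustrates that idea and is not necessarily how the numbers in this comparison were produced. The `add` and `commit` callables and the batch size are hypothetical stand-ins for whatever the real connection API provides.

```python
def create_in_batches(make_object, total, batch_size, add, commit):
    """Create `total` objects, committing after every `batch_size`.

    `add` and `commit` are hypothetical callables standing in for the
    real connection API. Returns the number of commits performed.
    """
    pending = 0
    commits = 0
    for i in range(total):
        add(make_object(i))
        pending += 1
        if pending == batch_size:
            commit()
            commits += 1
            pending = 0
    if pending:
        # Flush the final, partially filled batch.
        commit()
        commits += 1
    return commits


# In-memory stand-ins so the sketch runs without a real storage.
pending_objects = []
committed = []


def add(obj):
    pending_objects.append(obj)


def commit():
    committed.extend(pending_objects)
    del pending_objects[:]


n = create_in_batches(lambda i: "News item %d" % i, 2500, 1000, add, commit)
print(n, len(committed))
```

A smaller batch size lowers peak memory at the cost of more commits; the right value depends on object size and the storage in use.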
1. Create a new storage
-----------------------
500,000 "NewsItem" instances:

                Time    RAM consumed (max)
FileStorage2    341     341MB
ShelfStorage    535     375MB
PgStorage       867     Python: approx. 105MB, Postgres: 30-45MB

2. Migrate an existing storage
------------------------------
A low-level "copy" of a file-based storage to the PostgreSQL storage.
This shows off PostgreSQL's COPY statement.

533,337 objects: 74.24 seconds

(Note that 23 seconds or so of that is the startup time of the
file-based storage, so the bulk load itself takes about 50 seconds
once the object records are prepared.)

If I had the desire to redesign the PostgreSQL storage, I think I'd
experiment with using COPY for regular object inserts into the db.

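To give a feel for what "preparing the object records" for COPY might involve, here is a minimal sketch that renders (oid, record) pairs in PostgreSQL's COPY text format. It is an illustration under assumptions, not the storage's actual code: the integer oid, the two-column layout, and the function name are all hypothetical. The resulting file-like object could then be streamed to the server by a driver, for example with psycopg2's `cursor.copy_from`.

```python
import io


def copy_text_rows(records):
    """Render (oid, record_bytes) pairs as COPY text-format rows.

    Hypothetical layout: an integer oid column and a bytea record column.
    """
    buf = io.StringIO()
    for oid, data in records:
        # Columns are tab-separated and each row ends with a newline.
        # The bytea value uses hex form (\x followed by hex digits); its
        # backslash is doubled because backslash is the escape character
        # in COPY text format.
        buf.write("%d\t\\\\x%s\n" % (oid, data.hex()))
    buf.seek(0)
    return buf


rows = copy_text_rows([(1, b"ab"), (2, b"\x00\xff")])
text = rows.read()
print(text, end="")
```

Hex-encoding the record sidesteps escaping of tabs, newlines, and backslashes inside pickled data, at the cost of doubling the size of the transferred text.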
3. Time to Pack
---------------
(Following an initial commit of 500,000 new object instances; does
not include startup time.)

                Seconds   RAM consumed during pack
FileStorage2    52        214MB
ShelfStorage    247       74MB
PgStorage       163       29MB (PostgreSQL server), 44MB (Python process)

4. Start Up Times
-----------------
(Time to get "root")

                 Seconds   RAM consumed
FileStorage2
  Before pack    12.316    75MB
  After pack     3.923     104MB
ShelfStorage
  Before pack    18.696    75MB
  After pack     0.001     14MB
PgStorage        0.084     Python: 15MB, Postgres: 18MB

5. Time to Access Objects
-------------------------
Best of three runs; a random object from within the 500,000 news
items is returned.

(All times after pack, in seconds)

                Constant   Random (number of objects fetched)
                1          10       100      1000
FileStorage2    0.006      0.128    0.294    1.970
ShelfStorage    0.010      0.063    0.317    2.113
PgStorage       0.008      0.099    0.576    3.372

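The measurement pattern behind a table like this (fetch N random objects, keep the last one, report the best of three runs) can be sketched as follows. This is not the original test script: `best_of`, `fetch_random`, and the plain list used as the store are hypothetical stand-ins, so the timings it prints say nothing about the storages themselves.

```python
import random
import time


def best_of(runs, func):
    """Call `func` `runs` times; return (best elapsed seconds, last result)."""
    best = None
    result = None
    for _ in range(runs):
        start = time.perf_counter()
        result = func()
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best, result


def fetch_random(store, count, rng):
    """Fetch `count` randomly chosen items from `store`; return the last."""
    item = None
    for _ in range(count):
        item = store[rng.randrange(len(store))]
    return item


# Stand-in for a real storage: a plain list of 500,000 titles.
store = ["News item %d" % i for i in range(500000)]
rng = random.Random(0)

for count in (1, 10, 100, 1000):
    elapsed, last = best_of(3, lambda: fetch_random(store, count, rng))
    print("last of %d: %.6f s (%s)" % (count, elapsed, last))
```

Taking the best of several runs filters out one-off delays such as cold caches, which matters here because a single slow disk read can dominate a sub-second measurement.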
Raw output from the test runs follows.

:!qpyrun.py testshelfstorage.py |& tee /tmp/v341441/1942
time to root 23.8985710144
Executing:
Iterations:1
Elapsed:0.0801169872284
Last result: ('Get one', , 'News item 122')
Executing:
Iterations:1
Elapsed:0.0923748016357
Last result: ('last of 10', , 'News item 488436')
Executing:
Iterations:1
Elapsed:0.402265071869
Last result: ('last of 100', , 'News item 335927')
Executing:
Iterations:1
Elapsed:2.19541192055
Last result: ('last of 1000', , 'News item 67932')
time to root 0.00104188919067
Executing:
Iterations:1
Elapsed:0.00779795646667
Last result: ('Get one', , 'News item 122')
Executing:
Iterations:1
Elapsed:0.0480928421021
Last result: ('last of 10', , 'News item 298789')
Executing:
Iterations:1
Elapsed:0.35208106041
Last result: ('last of 100', , 'News item 280986')
Executing:
Iterations:1
Elapsed:2.34385490417
Last result: ('last of 1000', , 'News item 202171')
:!qpyrun.py testshelfstorage.py |& tee /tmp/v341441/2046
time to root 0.00103092193604
Executing:
Iterations:1
Elapsed:0.00763297080994
Last result: ('Get one', , 'News item 122')
Executing:
Iterations:1
Elapsed:0.0988612174988
Last result: ('last of 10', , 'News item 328562')
Executing:
Iterations:1
Elapsed:0.811402082443
Last result: ('last of 100', , 'News item 261994')
Executing:
Iterations:1
Elapsed:6.95415091515
Last result: ('last of 1000', , 'News item 363208')
g# /www/lib/parlez/test% python -OO testshelfstorage.py
time to root 0.00103402137756
Executing:
Iterations:1
Elapsed:0.00691914558411
Last result: ('Get one', , 'News item 122')
Executing:
Iterations:1
Elapsed:0.109712123871
Last result: ('last of 10', , 'News item 337818')
Executing:
Iterations:1
Elapsed:0.549973964691
Last result: ('last of 100', , 'News item 462327')
Executing:
Iterations:1
Elapsed:4.48995804787
Last result: ('last of 1000', , 'News item 156833')
16MB --- frog# /www/lib/parlez/test% python -OO testshelfstorage.py
time to root 0.00104188919067
Executing:
Iterations:1
Elapsed:0.0073230266571
Last result: ('Get one', , 'News item 122')
Executing:
Iterations:1
Elapsed:0.108702898026
Last result: ('last of 10', , 'News item 269579')
Executing:
Iterations:1
Elapsed:0.528451919556
Last result: ('last of 100', , 'News item 225540')
Executing:
Iterations:1
Elapsed:4.08506298065
Last result: ('last of 1000', , 'News item 219090')
104MB --- frog# /www/lib/parlez/test% python -OO testfilestorage.py
time to root 3.99423408508
Executing:
Iterations:1
Elapsed:0.00610613822937
Last result: ('Get one', , 'News item 122')
Executing:
Iterations:1
Elapsed:0.111690998077
Last result: ('last of 10', , 'News item 7516')
Executing:
Iterations:1
Elapsed:0.299579143524
Last result: ('last of 100', , 'News item 176199')
Executing:
Iterations:1
Elapsed:1.95170402527
Last result: ('last of 1000', , 'News item 78696')
PG
time to root 0.0838949680328
Executing:
Iterations:1
Elapsed:0.0108370780945
Last result: ('Get one', , '122 news item')
Executing:
Iterations:1
Elapsed:0.0615499019623
Last result: ('last of 10', , '286915 news item')
Executing:
Iterations:1
Elapsed:0.465358018875
Last result: ('last of 100', , '274957 news item')
Executing:
Iterations:1
Elapsed:3.03280711174
Last result: ('last of 1000', , '362657 news item')
time to root 0.08460688591
Executing:
Iterations:1
Elapsed:0.0102488994598
Last result: ('Get one', , '122 news item')
Executing:
Iterations:1
Elapsed:0.0701730251312
Last result: ('last of 10', , '204234 news item')
Executing:
Iterations:1
Elapsed:0.8127348423
Last result: ('last of 100', , '275712 news item')
Executing:
Iterations:1
Elapsed:3.86406707764
Last result: ('last of 1000', , '142954 news item')
time to root 0.0844349861145
Executing:
Iterations:1
Elapsed:0.0102880001068
Last result: ('Get one', , '122 news item')
Executing:
Iterations:1
Elapsed:0.117120981216
Last result: ('last of 10', , '404922 news item')
Executing:
Iterations:1
Elapsed:0.478250980377
Last result: ('last of 100', , '185840 news item')
Executing:
Iterations:1
Elapsed:3.01914596558
Last result: ('last of 1000', , '198043 news item')