durusmail: durus-users: building a large BTree efficiently
building a large BTree efficiently
David Binger
2005-10-17
On Oct 17, 2005, at 8:23 AM, mario ruggier wrote:

> OK, I have a few stats files if anyone cares to eyeball thru them.
>
> The more interesting is the second ordering, by tottime. Most of
> the time seems to be consumed by _p_load_state, load_state,
> get_state, by get_position, and __new__ (this last one is
> surprising, as to my understanding there should be zero new
> persistent objects resulting from this process). get_position's
> share of tottime is high for the 100K and 200K objects run, but
> surprisingly goes down for the 300K (where logically it should be
> doing more work).
>
> As the number of objects indexed increases, then get_state and
> __new__ share of time consumption grows to close to all of it.
>

When an object is loaded, the __new__ method is called.
All of the persistent instances are "new" to this process.
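
To illustrate why __new__ can show up in a profile even though no objects are being created from the application's point of view: a loader typically rebuilds an instance by calling the class's __new__ and then filling in the saved state, without running __init__. The sketch below is plain Python with a hypothetical Record class and load() helper. It is not Durus's actual loading code, only the general pattern.

```python
class Record:
    """Hypothetical stand-in for a persistent class; counts constructor calls."""
    new_calls = 0
    init_calls = 0

    def __new__(cls, *args, **kwargs):
        cls.new_calls += 1
        return super().__new__(cls)

    def __init__(self, value):
        type(self).init_calls += 1
        self.value = value


def load(cls, state):
    """Rebuild an instance from saved state without running __init__."""
    obj = cls.__new__(cls)        # allocates the instance; counted above
    obj.__dict__.update(state)    # restore attributes directly
    return obj


original = Record(42)                   # __new__ and __init__ each run once
saved_state = dict(original.__dict__)   # stands in for state read from storage
loaded = load(Record, saved_state)      # only __new__ runs again
```

After this runs, __new__ has been called twice and __init__ once, so every object read from storage contributes a __new__ call to the profile.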

I think your tests are measuring read() and paging times
more than anything else.  Your rebuild_index function apparently
requires every object to be loaded and kept in RAM.  I don't think
you have enough RAM to do that.

You might see some change (could be better or could be worse)
by calling Connection.shrink_cache() occasionally, or by using a
BTree with a different degree.
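
One way to try the first suggestion is to call a cache-shrinking hook every so many items while the index is being built. This is a minimal sketch under assumptions: rebuild_index here is a hypothetical reimplementation (the original function was not posted), `index` is any mapping, and `shrink` is any zero-argument callable. With Durus, `shrink` would presumably be the connection's shrink_cache method and `index` a BTree; whether a commit is needed first for the cache to release modified objects is not covered here.

```python
def rebuild_index(items, index, shrink, every=10000):
    """Store (key, value) pairs in index, calling shrink() every `every` items.

    Returns the number of times shrink() was called.
    """
    if every < 1:
        raise ValueError("every must be at least 1")
    shrink_calls = 0
    for count, (key, value) in enumerate(items, start=1):
        index[key] = value
        if count % every == 0:
            shrink()              # give the cache a chance to drop objects
            shrink_calls += 1
    return shrink_calls


# Stand-ins for demonstration: a dict as the index and a counter as the hook.
calls = []
index = {}
n_shrinks = rebuild_index(((i, i * i) for i in range(25)), index,
                          shrink=lambda: calls.append(1), every=10)
```

The interval is the knob to experiment with: shrinking too often may force objects to be reloaded, while shrinking too rarely leaves the paging problem in place.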






