On Oct 17, 2005, at 5:15 PM, David Binger wrote:

> On Oct 17, 2005, at 10:56 AM, mario ruggier wrote:
>
>> The 3-line iteration over self._items is where it all happens... and
>> it is (almost) pure BTree code. self.get_item_key() just builds a
>> tuple of attr values from item. Looking at BTree's and BNode's
>> __iter__ methods, not clear to me why this will require all items to
>> be in memory.
>
> It is because you can't get the attr values from the item without the
> item being loaded, and Durus does not automatically flush object state
> from memory except when shrink_cache() is called.

Ah, I was not aware of that!

I am playing with shrink_cache()... first of all I am setting an
iteration chunk size of:

    chunk = int(self._p_connection.cache.get_size()/2)

It is half the cache_size, because the objects that must be loaded are
not only the values but also the persistent objects used in the key
(a tuple), for each key,value added to the BTree.

Then inside the iteration loop, when I hit a chunk count, I am doing:

    cache_count = self._p_connection.cache.get_count()
    self._p_connection.shrink_cache()
    shrunk_count = self._p_connection.cache.get_count()

However, cache_count and shrunk_count are almost always the same, and
well above the value of the connection's cache_size (even the first
time around). After a few chunks, some objects seem to be removed from
the cache, but very few... The Virtual Memory size of the process
continues to increase slowly.

Here are some specific numbers (running under profile actually), for a
connection with a cache_size of 50000. It seems that around the 200K
object mark (iteration) it really starts to flail... The data columns
printed below are:

    num chunk * chunk_size : cache_count ~ shrunk_count : chunk time ~ tot time

$ python build_stocks_db_indices.py
....
Rebuilding Quotes index: ('symbol', 'date')
 1 * 25000 :  26804 ~  26804 : 106.8546 ~  106.8593
 2 * 25000 :  53482 ~  53482 : 129.3596 ~  236.2189
 3 * 25000 :  80191 ~  80191 : 139.9835 ~  376.2024
 4 * 25000 : 106875 ~ 106875 : 151.0538 ~  527.2562
 5 * 25000 : 133570 ~ 133093 : 158.1503 ~  685.4065
 6 * 25000 : 159771 ~ 158076 : 203.4847 ~  888.8912
 7 * 25000 : 184775 ~ 183403 : 224.9891 ~ 1113.8803
 8 * 25000 : 210126 ~ 208997 : 248.6730 ~ 1362.5533
 9 * 25000 : 235657 ~ 234545 : 401.5871 ~ 1764.1404
10 * 25000 : 261258 ~ 259992 : 661.5155 ~ 2425.6560

mario
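
To make the figures above easier to read, here is a minimal sketch that
recomputes, for each chunk, how many objects shrink_cache() released and
how far the cache count stayed above the cache_size of 50000. The rows
are copied from the output above; the names ROWS, CACHE_SIZE and
summarize, and the dictionary keys, are illustrative only and are not
part of Durus or of the original script.

```python
# Rows copied from the output above:
# (chunk number, cache_count, shrunk_count, chunk time in seconds)
ROWS = [
    (1, 26804, 26804, 106.8546),
    (2, 53482, 53482, 129.3596),
    (3, 80191, 80191, 139.9835),
    (4, 106875, 106875, 151.0538),
    (5, 133570, 133093, 158.1503),
    (6, 159771, 158076, 203.4847),
    (7, 184775, 183403, 224.9891),
    (8, 210126, 208997, 248.6730),
    (9, 235657, 234545, 401.5871),
    (10, 261258, 259992, 661.5155),
]

# The connection's cache_size, as stated in the post.
CACHE_SIZE = 50000


def summarize(rows, cache_size):
    """Return one dict per chunk with derived figures (illustrative helper)."""
    summary = []
    for num, cache_count, shrunk_count, seconds in rows:
        summary.append({
            "chunk": num,
            # Objects actually dropped by this shrink_cache() call.
            "released": cache_count - shrunk_count,
            # How far the cache is still above the configured size.
            "over_limit": shrunk_count - cache_size,
            "seconds": seconds,
        })
    return summary


summary = summarize(ROWS, CACHE_SIZE)
for row in summary:
    print("%(chunk)2d  released=%(released)5d  "
          "over_limit=%(over_limit)7d  seconds=%(seconds).4f" % row)
print("total released:", sum(row["released"] for row in summary))
```

On these numbers the ten shrink_cache() calls released 7051 objects in
total, and after the last call the cache still held 209992 objects more
than the configured cache_size.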