David Binger wrote:
> For this purpose, the cache must provide access
> to all of this connection's persistent instances referenced from
> anywhere in this process.

Understood. The weak/strong reference game in the cache code is very smart. It deserves to be better documented, for the enlightenment of future generations :)

Nevertheless, I think I found a bug in the cache shrink code:

"""
slice_size = max(min(self.size - current, current / 4), current / 64)
"""

When this code is reached, "current" is ALWAYS bigger than "self.size". So the "min()" always gives a negative number, and the "max()" always gives "current/64". That is, slice_size is ALWAYS "current/64". (A small demonstration follows at the end of this message.)

I see other issues with the current code. For example, since "shrink()" is implicitly called when doing a commit/abort, and the rules forbid keeping persistent references across transactions, the "cache._get_heap()" call will dispose of a lot of objects when doing the "reference.make_weak()", without checking whether each object was referenced recently (because we don't keep references to persistent objects between transactions, the cache may be the only link to an object even if it is very "fresh"). In fact, if the cache is big and the commits touch few objects, "cache._get_heap()" will dispose of (statistically) every object it touches.

A better way would be to take "X" objects from the cache, sort them by "_p_touched" (which records the commit generation in which each object was last touched), and play the "make_weak()" game on the "X/2" (or "X/10") oldest of them. That way we would keep the most recently touched objects in core. (See the sketch near the end of this message.)

Also, the object selection should be random. The current approach is deterministic, and I can imagine a cache content and access profile where the same objects are always analyzed (and others are never examined), or where a given slice holds objects that share the same "_p_touched" and so are always purged, even though they were recently accessed.

The obvious solution is to sort the whole cache by "_p_touched", do the "make_weak()"/"ghosting" magic on the first "x/64" objects (let's say), and raise their "_p_touched" to the current level so that other objects get examined in the next shrink. I don't know how expensive a global sort by "_p_touched" would be compared to the "islice()" usage in "cache._get_heap" when "start" has a high value...

Or maybe simply choose "x/64" random objects from the entire cache, sort them by "_p_touched", and perform the "make_weak()"/"ghosting" miracle on the first (and therefore oldest) "x/128".

PS: The "islice()" performance worried me, so I did the following experiment:

a) Create a dictionary with a million elements.

b) Use "islice()" to iterate over the last 100,000 elements, with an empty loop body. Time on my machine: 0.0604 seconds.

c) Use "random.sample()" to choose 100,000 random elements: 0.27 seconds.

"islice()" is wonderfully efficient!! Nevertheless, in this test the loop body was a no-op. If the loop body did real work, the difference between the objects used to iterate would become irrelevant... (The last sketch below reproduces this experiment.)
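Here is a minimal, self-contained sketch of the arithmetic above, outside the real cache code, just to show the degeneration. "size" and "current" stand in for "self.size" and the current object count, and the assumption (from the code path) is that shrink only runs when current > size:

"""
def slice_size(size, current):
    # Same expression as the shrink code; "//" just makes the
    # old-style integer division explicit.
    return max(min(size - current, current // 4), current // 64)

# When current > size, "size - current" is negative, so min() always
# picks it, and max() always falls back to current // 64.
for size, current in [(10000, 10001), (10000, 20000), (10000, 640000)]:
    assert slice_size(size, current) == current // 64
    print(size, current, slice_size(size, current))
"""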
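And a rough sketch of the random-sampling idea. This is not Durus code: "objects" is assumed to be a list of the cached persistent instances, "weaken" stands in for whatever the real "reference.make_weak()" dance is, and the X/2 split is arbitrary:

"""
import random

def shrink_by_sample(objects, x, current_generation, weaken):
    # Pick X random candidates, so no fixed slice of the cache is
    # examined (and purged) on every shrink.
    candidates = random.sample(objects, min(x, len(objects)))
    # Oldest commit generation first.
    candidates.sort(key=lambda obj: obj._p_touched)
    # Weaken only the older half; recently touched objects stay strong.
    for obj in candidates[:len(candidates) // 2]:
        weaken(obj)
        # Re-arm, so other objects get examined in the next shrink.
        obj._p_touched = current_generation
"""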
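Finally, a sketch of the PS experiment; the exact construction here is approximate, and the timings obviously depend on the machine:

"""
import random, time
from itertools import islice

d = dict((i, None) for i in range(1000000))  # a million-element dict

t = time.time()
for key in islice(d, 900000, None):   # the last 100,000 elements
    pass                              # empty loop body
print("islice:", time.time() - t)

t = time.time()
for key in random.sample(list(d), 100000):  # 100,000 random elements
    pass
print("random.sample:", time.time() - t)
"""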
--
Jesus Cea Avion - jcea@argo.es - http://www.argo.es/~jcea/
jabber / xmpp:jcea@jabber.org