Re: A question about consistence in durus
Jesus Cea
2006-04-22

David Binger wrote:
> You are correct that we could track actual accesses since the
> last transaction, and use that to decide if we can avoid a ConflictError.
> I think that would require a little extra work on every
> attribute access of a persistent instance.  Would the benefit
> be worth the cost?  The answer is not obvious to me.

We only need to track the first access to each object in a transaction.
So if a transaction involves 50 objects, there will be only 50
intercepted "attribute misses", and all 50 of them will be cache hits.

Current Durus keeps each object in one of three states:

SAVED: Object is in cache, unmodified
UNSAVED: Object is in cache, modified
GHOST: Object is not yet loaded.

We could add a fourth state, marking a SAVED object that has not been
referenced in the current transaction yet. Or we could convert all
objects to GHOST after each transaction, but back them with a cache, so
that a GHOST object could be promoted to SAVED/UNSAVED either from a
Durus server fetch or from the local cache, just as it is now.
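
Roughly, the mechanism I have in mind looks like this (a minimal sketch
in plain Python, NOT Durus internals; every name below is made up). An
"inactive" object intercepts its first attribute read, records it, and
promotes itself so that later reads are ordinary attribute lookups:

    accessed_this_transaction = set()   # cleared on commit()/abort()

    class Active(object):
        # Plain attribute access, no interception overhead.
        pass

    class Inactive(Active):
        # Every object would be demoted to this state when a
        # transaction ends.
        def __getattribute__(self, name):
            # First read in this transaction: note it and promote.
            accessed_this_transaction.add(id(self))
            object.__setattr__(self, '__class__', Active)
            return object.__getattribute__(self, name)

    obj = Inactive()
    obj.x = 42       # writes are not intercepted in this sketch
    print(obj.x)     # first read: intercepted, obj becomes Active
    print(obj.x)     # later reads: ordinary lookup, no extra cost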

The problem with the current approach is the "false conflict". The cost
of redoing an entire transaction, unnecessarily, just to cope with
false conflicts seems pretty high. If the cost of intercepting the
first attribute access (only the first one!) is low, and it should be,
then we should do it.

Here is a quick check on a Sun X2100 machine (Opteron, 2.2 GHz):

A million accesses to an attribute: 0.167 seconds.
A million accesses to an attribute via __getattr__: 1.11 seconds.

So intercepted accesses are about 7 times slower, but ONLY the first
time. Repeated accesses to the same object pay the overhead just once.
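
Something along these lines reproduces the measurement (timeit is in
the standard library; the absolute numbers depend on the machine, of
course):

    import timeit

    class Plain(object):
        x = 1

    class Hooked(object):
        def __getattr__(self, name):
            # Only reached when normal lookup fails; this is the kind
            # of slow path a "first access" hook would take.
            return 1

    plain = Plain()
    hooked = Hooked()
    print(timeit.timeit(lambda: plain.x, number=1000000))   # plain
    print(timeit.timeit(lambda: hooked.x, number=1000000))  # __getattr__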

By comparison, contacting the Durus server is far more costly, even
when there are no conflicts: a million "empty" commits takes 77.22
seconds, with the server on the same host and a dual-core CPU.

In other words, if you touch 1000 objects per transaction (a huge
number...) you only add about a millisecond per transaction. Anything
you actually do with those 1000 objects will cost more. If you touch 50
objects, the overhead is about 56 microseconds.
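
Working those figures out from the measurements above:

    1.11 s / 1,000,000 accesses  ~= 1.1 microseconds per intercepted access
    1000 objects x 1.1 us        ~= 1.1 milliseconds per transaction
      50 objects x 1.1 us        ~=  56 microseconds per transaction
    77.22 s / 1,000,000 commits  ~= 77 microseconds per "empty" commit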

In any case, any (reasonable) overhead is tolerable when you remember
that you will end up doing a synchronous disk write anyway, and you can
only do about 60 of those per second on usual hardware.

Finally, I think this behavior could easily be made configurable,
perhaps through the "connection" constructor.

Anything more I can do to convince you?

:-)

--
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea@argo.es http://www.argo.es/~jcea/ _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea@jabber.org         _/_/    _/_/          _/_/_/_/_/
                               _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz