durusmail: durus-users: Revised paging-in patch
Jesus Cea
2006-07-08

A.M. Kuchling wrote:
> +            for oid, record in response:
> +                s.sendall(oid + p32(len(record)) + record)

Why not build the "response" list with "oid + p32(len(record)) +
record" already concatenated? Then you could make a single
"s.sendall(b''.join(response))" call instead of a loop with up to 100
"s.sendall()" calls.
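A single-send version could look like the sketch below; the `struct`-based
`p32` here is an assumption about its encoding, and `response` is taken to
be a list of `(oid, record)` byte-string pairs as in the quoted patch:

```python
import struct

def p32(n):
    # Assumed encoding: 32-bit big-endian unsigned int, as in Durus's
    # wire format helpers.
    return struct.pack('>I', n)

def encode_response(response):
    # Concatenate every (oid, record) entry up front so the server can
    # reply with one sendall() instead of up to 100 separate calls.
    parts = [oid + p32(len(record)) + record for oid, record in response]
    return b''.join(parts)

# On the server side, the loop then collapses to:
#     s.sendall(encode_response(response))
```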

> +                if hasattr(obj, '_p_oid') and obj._p_is_ghost():
> +                    refs.append(obj._p_oid)
> +                    assert obj._p_is_ghost()

That assert is unnecessary because you test it inside an "if" with the
very same condition.

> +                # Perform a bulk load of the first 100 queue elements
> +                oid_list = queue[:100]
> +                del queue[:100]

Could the quantity be negotiated between the client and the server? I
don't like hard-coded numbers, especially when they could be tunable.
Currently, changing that value requires editing both the clients and
the server.

Maybe a simple option would be something like this:

1. Each client connection keeps two internal variables: the maximum
number of requested items until now and a flag called "capped".
Initially maximum=100 and capped=False.

2. When sizing the "bulk_load()", test the "capped" flag. If true, ask
for "maximum" OIDs. If false, do "maximum=maximum*1.5", for example.

3. When receiving the "bulk_load()" answer, count the objects received.
If received == maximum, do nothing; but if received < maximum, the
server has capped the request, so set "maximum = received" and
"capped = True".

> +                # Filter out OIDs we've already seen
> +                oid_list = [oid for oid in oid_list
> +                            if oid not in seen]
> +                if len(oid_list) == 0:
> +                    # If list is empty, don't bother sending a
> +                    # bulk-load request for 0 OIDs.
> +                    continue
> +
> +                # Request bulk load
> +                records = self.storage.bulk_load(oid_list)

I think that filtering out already-seen OIDs should be done before
splitting off the bulk_load batch, since the current code can issue a
"bulk_load()" of far fewer than 100 OIDs when the filtering has a lot
of hits. Also, a functional "filter()" call would probably be more
efficient; some profiling comparison would be nice.
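The suggested reordering could look like this sketch, assuming "queue" is
a list of OIDs and "seen" is a set (names follow the quoted patch; the
helper is invented for illustration):

```python
def take_batch(queue, seen, limit=100):
    # Filter first, then slice, so every bulk_load() request carries
    # up to `limit` genuinely new OIDs rather than a slice that may
    # shrink after filtering.
    queue[:] = [oid for oid in queue if oid not in seen]
    batch = queue[:limit]
    del queue[:limit]
    return batch

# records = self.storage.bulk_load(take_batch(queue, seen))
```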

> +                    new_obj = self.cache.get(oid)
> +                    if not new_obj._p_is_ghost():
> +                        # Already resident, so skip it
> +                        continue
> +                    ##print repr(oid), new_obj, new_obj._p_is_ghost()
> +                    if new_obj is None:

You already determined, via the "list_refs()" utility function, that
"new_obj" is a ghost. The "if new_obj is None" test will always be
false because the object is ALWAYS in the cache, AS A GHOST. In fact,
if "new_obj" were None, the earlier "if not new_obj._p_is_ghost()"
would crash the application, since "None" has no "_p_is_ghost()"
method. So you can delete that code; the entire "if" is dead code.
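To make the ordering point concrete, here is a toy sketch (the "Ghost"
class and dict-based cache are invented stand-ins, not Durus's real
classes): a None check is only reachable if it comes before the first
attribute access, since "None._p_is_ghost()" raises AttributeError.

```python
class Ghost:
    # Stand-in for a cached ghost object; not Durus's real class.
    def _p_is_ghost(self):
        return True

def classify(cache, oid):
    new_obj = cache.get(oid)
    if new_obj is None:          # must come first to ever be reachable
        return 'missing'
    if not new_obj._p_is_ghost():
        return 'resident'
    return 'ghost'
```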

> +                    queue.extend(split_oids(refdata))
> +                    seen.add(oid)

Perhaps you could filter "refdata" against "seen" before adding it to
the queue. I'm not sure whether that would be profitable.

If I understand your code correctly, it can load an entire object
closure into RAM. So a "page_in()" on the "root" object could load the
whole object database into RAM. Am I right?
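The adaptive sizing described in steps 1-3 above could be sketched as
follows (the class name, the 50% growth factor's handling, and the
reconstructed step-3 behaviour are my assumptions, not part of the
patch):

```python
class BulkLoadSizer:
    """Per-connection state: grow the batch size until the server caps it."""

    def __init__(self, initial=100):
        self.maximum = initial   # step 1: maximum=100, capped=False
        self.capped = False

    def next_request_size(self):
        # Step 2: once capped, stick to the negotiated maximum;
        # until then, probe upward by 50% per request.
        if not self.capped:
            self.maximum = int(self.maximum * 1.5)
        return self.maximum

    def record_response(self, received):
        # Step 3: a short reply reveals the server's limit,
        # so remember it and stop growing.
        if received < self.maximum:
            self.maximum = max(received, 1)
            self.capped = True
```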

- --
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea@argo.es http://www.argo.es/~jcea/ _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea@jabber.org         _/_/    _/_/          _/_/_/_/_/
                               _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz