durusmail: durus-users: My wish-list for Durus 3.6 (20061031)
David Binger
2006-10-31
On Oct 31, 2006, at 11:59 AM, Jesus Cea wrote:

> 1. Be able to play with threads in the same "safe" ways as Durus
> pre-3.5. (would be nice if available in Durus 3.5.1 :-). The patch
> David sent when 3.5 came out seems to work fine.

I don't remember the details of that, but I'll assume that we have this
covered for the next release.

>
> 2. Fully implement "self.storage.sync()" support in
> serverstorage/filestorage.

I think I have this working now in the devel code.  In particular,
I can run a StorageServer whose underlying storage is itself
a ClientStorage connected to another StorageServer, and I can run
the stress test on connections to a master and multiple secondary
storage servers.

>
> - - Notify clients of objects garbage collected, to update their
> cache to
> a) reclaim memory early and b) to avoid "resurrecting" a dead object
> (for example, keeping (incorrectly) a nonpersistent reference around).

I have FileStorage reporting removed oids on the first call after a
pack, so we will have this early notification in the "normal" Durus
client/server configuration.

> - - Several serverstorage instances in a multithread server could
> share a single backend instance (if it is thread-safe). You could use
> a server instance per client, for example, allowing multiple read
> requests in parallel (if the backend allows). The synchronization
> point would be the backend (inside the "begin" method), so commits
> would be serialized correctly.

You can work out those "ifs".
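For what it's worth, the synchronization-point idea can be sketched in plain Python, independent of the real Durus classes. Everything here (SharedBackend, the begin/store/end interface) is hypothetical, not the actual Durus storage API; it only shows how a lock acquired in "begin" and released at the end of a commit serializes writers while leaving reads unsynchronized.

```python
import threading

class SharedBackend:
    """Hypothetical thread-safe backend: commits serialize on one lock."""
    def __init__(self):
        self._lock = threading.Lock()
        self._records = {}

    def begin(self):
        # The synchronization point: only one commit proceeds at a time.
        self._lock.acquire()

    def store(self, oid, record):
        self._records[oid] = record

    def end(self):
        self._lock.release()

    def load(self, oid):
        # Reads take no lock in this toy version; a real backend would
        # need its own read-safety guarantees.
        return self._records[oid]

def commit(backend, changes):
    """Commit a dict of {oid: record}; serialized by the backend lock."""
    backend.begin()
    try:
        for oid, record in changes.items():
            backend.store(oid, record)
    finally:
        backend.end()
```

The try/finally matters: a failed store must still release the lock, or every other writer deadlocks.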

> - - Be able to "replicate" a durus storage via the backend ability to
> propagate changes to other (remote) processes. For example, BerkeleyDB
> supports replication natively.

Replication does not really require any code changes.
A replication process can transfer objects from one storage
to another, and use invalidations from the master to replicate
to the mirror.  This has always been possible in Durus, and
pretty simple.

With the server-sync, there will be new options for this,
as you say.
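The transfer-plus-invalidations idea above can be illustrated with a toy model. ToyStorage, its sync() method, and the replicate() helper are all hypothetical stand-ins, not the Durus storage interface; the point is just that the master only needs to report which oids changed since the last sync, and the replicator copies exactly those records to the mirror.

```python
class ToyStorage:
    """Hypothetical minimal stand-in for a Durus storage."""
    def __init__(self):
        self.records = {}
        self.invalidations = []   # oids changed since the last sync

    def store(self, oid, record):
        self.records[oid] = record
        self.invalidations.append(oid)

    def sync(self):
        # Hand back the changed oids and reset the list.
        changed, self.invalidations = self.invalidations, []
        return changed

def replicate(master, mirror):
    """Copy only the records the master reports as changed."""
    for oid in master.sync():
        mirror.store(oid, master.records[oid])
```

Because the mirror records its own invalidations as it stores, the same loop could cascade to a second-level mirror downstream.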

>
> - - Be able to deploy durus "cache only" storageservers: the storage
> backend would be a durus client of another "real" durus server. That
> backend should be able to propagate invalidations upstream to its
> clients.

Yes, this will be easier.

> - - Be able to implement multithreaded filestorage processes, sharing
> a single backend but keeping separate caches/connections to it.
> Remember that a single storage pool can only be accessed by a single
> filestorage instance. This improvement would allow opening several
> filestorages on the same data. Useful if you have a multithreaded
> application and can't share a single filestorage instance (very
> problematic with the current Durus 3.5).

That sounds dangerous to me.

>
> 3. "connection.get()" does two database requests: one to get the
> object type and create the ghost object in the cache, and another to
> actually populate the cache when the object is "touched". This is not
> usually a problem because the only object requested twice is "root";
> all the others are reachable from there, and when traversing links we
> already know the next object's type (in fact, that object would
> already be a ghost).
>
> Nevertheless, I sometimes use "connection.get()" like a "weak
> reference" between objects. That is, instead of keeping an object
> link, I keep an object OID and then try to access it just like a
> Python weak reference. This "weak reference" doesn't preclude the
> object from being garbage collected, just like usual weak references.
> The usage case is the same as Python weak references: keeping
> references to objects without precluding garbage collection.
>
> I feel that needing two database accesses is an actual bug masked by
> the current Durus usage profile, but revealed by my "weak ref" pattern.

I think we've changed it so that get() only takes one load.

Don't use connection.get() like a weak reference.  If you want
weak references, use your own application's identifiers.
Don't bet your application on Durus identifiers never changing.
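One way to follow that advice is a small identifier registry kept in the root object, so lookups go through stable application-level keys rather than storage OIDs. The Registry class below is a hypothetical sketch, not part of Durus; an entry behaves like the "weak reference" pattern in the sense that a lookup after removal just returns a default instead of resurrecting anything.

```python
import uuid

class Registry:
    """Hypothetical application-level identifier map (kept in the root).
    Keys are stable application ids, not storage OIDs, so they survive
    packing or any future OID reassignment."""
    def __init__(self):
        self.by_id = {}

    def register(self, obj):
        key = uuid.uuid4().hex
        self.by_id[key] = obj
        return key

    def lookup(self, key, default=None):
        # Like dereferencing a dead weak reference: removed entries
        # simply yield the default.
        return self.by_id.get(key, default)

    def remove(self, key):
        # Must be called when the object is deleted; the registry entry
        # itself is a strong reference until then.
        self.by_id.pop(key, None)
```

Unlike a real weakref, the registry entry keeps the object reachable until remove() is called, so deletion code has to clean up its keys.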

>
> 4. "connection" objects should provide a method to query how many
> cache misses we have had since the connection was instantiated. With
> this functionality, the client could tune its cache size
> "automagically". Being able to see the "accumulated" idle time spent
> waiting for a remote object to arrive would be nice too.

Nothing has been done on any of this, but it seems possible that
it might get done.
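The counting part, at least, is cheap to prototype. CountingCache below is a hypothetical toy, not the Durus connection cache; it just shows the bookkeeping an "automagic" tuner would need: a miss counter since creation plus a derived miss rate.

```python
class CountingCache:
    """Hypothetical object cache counting misses since creation, so an
    application could grow the cache when the miss rate stays high."""
    def __init__(self, size):
        self.size = size
        self.data = {}
        self.misses = 0

    def get(self, oid, load):
        if oid in self.data:
            return self.data[oid]
        self.misses += 1
        obj = load(oid)              # stands in for a (slow) storage load
        if len(self.data) >= self.size:
            # Crude FIFO eviction: drop the oldest entry.
            self.data.pop(next(iter(self.data)))
        self.data[oid] = obj
        return obj

    def miss_rate(self, accesses):
        return self.misses / accesses if accesses else 0.0
```

Timing the accumulated wait for remote objects would be the same pattern: wrap the load call and sum the elapsed time into another counter.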

>
> 5. A "mutable data changed" notificator for BTree.

Done.

>
> 6. Add a "close" method to the storage backend interface. This could
> do things like file-lock cleanup, background-thread stopping, or
> replication exclusion.

Done.

>
> 7. "Factorize" the Durus server socket management (in particular,
> socket creation for incoming connections and socket "select") to be
> able to reuse the server code with other communication media, like
> shared memory, intraprocess queues, or mmapped files.

Nothing done here.

>
> 8. A precompiled Durus distribution for Windows users. Please!

It would be nice if someone else would provide this.
I don't think we'll offer binary distributions for any platform
from here.

>
> 9. Be able to raise a "read only" exception when a client requests
> new OIDs or tries a commit with changed objects. This would allow
> read-only connections without setting the entire storage "read-only".
> Also, currently if a storage is read-only, clients are disconnected
> when trying to commit changes, with no real indication of the
> problem.

I'll try to act on this.
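The requested behavior amounts to failing loudly instead of dropping the connection. ReadOnlyError and ToyConnection below are hypothetical names, not the Durus API; the sketch only shows a commit path that checks a per-connection read-only flag before touching the storage.

```python
class ReadOnlyError(Exception):
    """Hypothetical exception raised instead of disconnecting the client."""

class ToyConnection:
    """Hypothetical per-connection read-only flag, independent of the
    storage's own read-only setting."""
    def __init__(self, storage, read_only=False):
        self.storage = storage        # toy storage: a plain dict
        self.read_only = read_only
        self.changed = {}             # oid -> record, pending commit

    def commit(self):
        if self.read_only and self.changed:
            # Tell the caller exactly what went wrong, rather than
            # silently closing the connection.
            raise ReadOnlyError("commit attempted on a read-only connection")
        self.storage.update(self.changed)
        self.changed.clear()
```

A commit with no changes still succeeds on a read-only connection, which matches how a sync-only commit ought to behave.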


--

Thanks for your ideas.
