Re: [Durus-users] PersistentDict vs BTree

On 10/5/06, Jesus Cea wrote:
> That is right. There are a lot of things Mike Orr must evaluate: For
> example, if any process is modifying the PersistentDict, readers must
> fetch it entire again (31 MB), while a BTree will only reload the
> modified nodes.

It's truly read-only.  If the data is updated, I assemble the database
on another computer, and stop the server while I'm copying it over.
The reason it's read-only is to avoid needing file-write permission in
the standalone version that people run on laptops. Some of our target user
organizations are very strict about what software can be installed.
(It does actually use a browser cookie, and users can download their
data to a file and re-upload it later to the application, but that's
different because the user is explicitly doing this through the
browser.)

> Also a PersistentDict commit will write 31 MB to
> storage, for example.

I get that now from the import routine, and it takes thirty seconds. I
build the entire database before committing it to eliminate any
obsolete objects wasting space in the database.  Though I do pack it
afterwards anyway.

> The Durus 3.5 "bulk object loader" method would prove useful here.

I'll try David's crawler suggestion and see if that improves anything.

One thing I've been stunned about is the advanced search, which I just
finished writing.  It goes through every record applying arbitrary
evaluators to it (field1 contains "foo"? field2 > 5.0?), and it takes
less than three seconds even with complex criteria that return 1000+
results. Something in the simple search is bogging down with terms
that return a lot of results (e.g., "lor" for chlorine, chlorazol,
etc.), in spite of the index dictionaries/tables I wrote to speed it
etc), in spite of the index dictionaries/tables I wrote to speed it
up. So I may have to implement the simple search in terms of the
advanced search, pulling all records and not using indexes,
counterintuitive as it sounds.
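
To illustrate the scan-everything approach, here is a minimal sketch of
a search that applies arbitrary evaluators to every record.  It is not
the application's actual code: the record layout, the field names, the
density values, and the helpers contains(), compare() and
advanced_search() are all hypothetical, and plain dicts stand in for
whatever persistent objects the real database holds.

```python
import operator


def contains(field, text):
    """Build an evaluator: true when record[field] contains text (case-insensitive)."""
    needle = text.lower()
    return lambda record: needle in str(record.get(field, "")).lower()


def compare(field, op, value):
    """Build an evaluator that applies a comparison operator to record[field]."""
    def evaluator(record):
        actual = record.get(field)
        # A missing field never matches.
        return actual is not None and op(actual, value)
    return evaluator


def advanced_search(records, evaluators):
    """Linear scan with no indexes: keep records that satisfy every evaluator."""
    return [r for r in records if all(e(r) for e in evaluators)]


# Made-up sample data, for illustration only.
records = [
    {"name": "chlorine", "density": 3.2},
    {"name": "chlorazol", "density": 7.5},
    {"name": "benzene", "density": 0.88},
]

# name contains "lor"?  density > 5.0?
hits = advanced_search(
    records,
    [contains("name", "lor"), compare("density", operator.gt, 5.0)],
)
```

A simple search could be expressed the same way by passing a single
contains() evaluator, which is roughly what implementing it "in terms
of the advanced search" would amount to.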

--
Mike Orr