durusmail: quixote-users: Performance of DirMapping
Performance of DirMapping
2005-02-18
mso@oz.net
I was converting the DirMapping session store to MySQLdb, but then started
wondering if it was worth it.  My application will have thirty concurrent
users at peak, or maybe fifty to allow for expansion, so it's not high
load.
My main concern is that DirMapping doesn't deal with concurrency in case
the user has two browser windows open in the same session and clicks in
both simultaneously.  I put 'flock' calls after the opens, which should
deal with concurrency.

    import fcntl

    class DirMapping:
        def __getitem__(self, session_id):
            file = open(filename, "rb")
            # Shared lock: other readers may proceed, writers are excluded.
            fcntl.flock(file.fileno(), fcntl.LOCK_SH)
            try:
                ...
            finally:
                fcntl.flock(file.fileno(), fcntl.LOCK_UN)
                file.close()

        def __setitem__(self, session_id, session):
            file = open(filename, "wb")
            # Exclusive lock: no other reader or writer while we write.
            fcntl.flock(file.fileno(), fcntl.LOCK_EX)
            try:
                ...
            finally:
                fcntl.flock(file.fileno(), fcntl.LOCK_UN)
                file.close()
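
One thing to watch in the __setitem__ above: opening with "wb" truncates
the file at open() time, before the exclusive lock is acquired, so a
reader holding the shared lock could briefly see an empty file.  A minimal
sketch of one way around that, assuming Python 3 and Unix, is to open
without truncating and truncate only once the lock is held.  The helper
names write_locked and read_locked are hypothetical and are not part of
DirMapping.

```python
import fcntl
import os
import pickle


def write_locked(filename, obj):
    """Pickle obj into filename while holding an exclusive flock."""
    # O_CREAT without O_TRUNC: the old contents stay intact until the
    # lock is actually held.
    fd = os.open(filename, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+b") as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
        try:
            f.seek(0)
            f.truncate()          # safe now: no reader holds the lock
            pickle.dump(obj, f)
            f.flush()             # push data out before releasing the lock
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)


def read_locked(filename):
    """Unpickle filename while holding a shared flock."""
    with open(filename, "rb") as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_SH)
        try:
            return pickle.load(f)
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)
```

flock locks are advisory, so this only protects against other code that
also takes the lock.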

I haven't used flock before but it seems to be working.  One concern is
I'm developing this on Linux but it needs to be portable to Mac OS X and
Windows.  The library manual doesn't say it's Unix-only, so hopefully it
will work.
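
A caveat on portability: Python's fcntl module is Unix-specific, so it
should import on Linux and Mac OS X but not on Windows.  The sketch below
shows one possible fallback, assuming Python 3: take the lock on a
separate "<path>.lock" file, using fcntl.flock where it exists and
msvcrt.locking otherwise.  The names locked, _lock and _unlock and the
".lock" suffix are hypothetical, and the Windows branch has not been
exercised here.

```python
import contextlib

try:
    import fcntl              # Unix only (Linux, Mac OS X)
except ImportError:
    fcntl = None
    import msvcrt             # Windows only


def _lock(f):
    if fcntl is not None:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
    else:
        # msvcrt.locking locks a byte range starting at the current
        # position; LK_LOCK retries for a while, then raises OSError.
        f.seek(0)
        msvcrt.locking(f.fileno(), msvcrt.LK_LOCK, 1)


def _unlock(f):
    if fcntl is not None:
        fcntl.flock(f.fileno(), fcntl.LOCK_UN)
    else:
        f.seek(0)
        msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, 1)


@contextlib.contextmanager
def locked(path):
    """Hold an exclusive lock on path + '.lock' for the with-block."""
    with open(path + ".lock", "a+b") as lock_file:
        _lock(lock_file)
        try:
            yield
        finally:
            _unlock(lock_file)
```

Usage would be "with locked(filename): ..." around both the read and the
write.  This sketch takes an exclusive lock for readers as well, which is
simpler than the shared/exclusive split but serializes reads.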

One thing I like about DirMapping is its in-memory cache that bypasses the
pickle.  That's the snag I got in the MySQL version.  I was porting the
caching code, and realized DirMapping compares the file's mtime to the
cached mtime.  The only way to do that in MySQL is with an extra database
query, which kind of defeats the purpose of using the database, since I'm
already using more queries than is efficient in order to avoid joins.  Either that or
cache the database's mtime too.  But I'm not even sure why it's comparing
the mtime.  Is there a normal case where the cache would be invalid, or
only if something external messed with the file?
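
For illustration, the mtime comparison described above can be sketched as
follows, assuming Python 3.  This is not DirMapping's actual code; the
class name MtimeCache and its loads counter are hypothetical.  One
plausible case where such a check matters is several server processes
sharing the same session directory: each has its own in-memory cache, and
the mtime is how one process notices that another has rewritten the file.

```python
import os
import pickle


class MtimeCache:
    """Cache unpickled objects, reloading when the file's mtime changes."""

    def __init__(self):
        self._cache = {}      # path -> (mtime_ns, object)
        self.loads = 0        # number of times a file was unpickled

    def get(self, path):
        mtime = os.stat(path).st_mtime_ns
        hit = self._cache.get(path)
        if hit is not None and hit[0] == mtime:
            return hit[1]     # file unchanged: skip the unpickle
        with open(path, "rb") as f:
            obj = pickle.load(f)
        self.loads += 1
        self._cache[path] = (mtime, obj)
        return obj
```

In a single-process server that is the only writer, the stat call would
presumably never find a mismatch, so the check mainly guards against
other writers.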

I looked at the PostgreSQL session store on the wiki, but it seems overly
complex, storing the user ID and remote IP and encompassing several
modules.  I want something simple and reliable, not something fancy and
maybe less reliable.  For the same reason I'm not interested in the
SQLObject versions.

I'm using MySQL rather than another database because it's needed for the
application anyway.

So is it worth converting to database sessions, or should I just stick
with DirMapping + flock?  Is there anything in my flock code that might be
a problem?

