[Quixote-users] Quixote programming notes

Quixote programming notes
2001-06-07
Andrew Kuchling
I've just checked the following in as doc/programming.txt.
Comments?

And is there anything else that needs to be done before
we can make a 0.3 release?

--amk


Programming Overview
====================

This file explains how a Quixote application is structured, using a
sample application called 'books'.

There are three components to a Quixote application:

1) A FastCGI script, usually called books.fcgi.  This script will
   create a Publisher object, and may customize the publisher in some
   application-specific way.  It will also tell the publisher what the
   root package name for the application is; for example, at the MEMS
   Exchange it's 'mems.ui', because all our Web code lives in the
   mems.ui package.  For our books example, we'll call the package
   simply 'books'.

   FastCGI isn't an absolute requirement.  It should be trivially easy
   to run Quixote apps using regular CGI, but this would be quite slow
   due to the number of modules that need to be imported at startup
   time.  We haven't tried to run Quixote under mod_python/mod_snake,
   but it shouldn't be too difficult; if you try it, please let us
   know how it goes.

2) A configuration file, usually called books.conf.  This file sets
   features of the Publisher class, such as how errors are handled
   and the paths of the log files.  Read through quixote/config.py
   for the full list of configuration settings.

   The most critical configuration parameters are:
      URL_PREFIX        Prefix of URLs that will be directed to Quixote
      ERROR_EMAIL       E-mail address to which errors will be mailed
      ERROR_LOG         File to which errors will be logged

3) Finally, the bulk of the code will be in a Python package; the
   Publisher class will be set up to start traversing at the package's
   root.  For our books example, the package will be named simply
   'books', so Python code could do 'from books import foo,bar'.


FastCGI script
==============

The FastCGI script can be very simple:

        books.fcgi
        ----------
#!/usr/bin/python
from quixote.config import Config
from quixote.publish import Publisher

PACKAGE_NAME = 'books'

def main():
    # Create configuration object with default values
    config = Config()

    # Read a configuration file
    config.read_file('/www/conf/books.conf')

    # Create a Publisher instance using the configuration
    pub = Publisher(PACKAGE_NAME, config)

    # Install the PTL import hook
    pub.install_imphooks()

    # Enter the publishing main loop
    pub.publish_cgi()

if __name__ == '__main__':
    main()

That's the simplest possible case.  The SessionPublisher class
in quixote.publish can also be used; it provides session tracking.
The changes required to use SessionPublisher would be:

...
from quixote.publish import SessionPublisher
from quixote.session import SessionManager
...
    pub = SessionPublisher(PACKAGE_NAME, config)
    pub.set_session_manager( SessionManager() )
...

It's also possible to subclass the Publisher or SessionManager classes
in order to provide some specialized behaviour necessary for your
application.  Some uses for this would be:

      * SessionManager stores user sessions in an in-memory dictionary,
        so restarting the FastCGI process will cause all the sessions
        to be lost.  The FastCGI process is terminated if the UI code
        raises an uncaught exception, so if sessions are used to
        contain important data, such as the contents of a shopping
        cart or a MEMS process sequence, it's better if sessions are
        stored in some persistent way.

      * If you're using the ZODB to store data, after each request
        you'll want to commit the current transaction if everything
        ran smoothly, or abort the transaction if an exception was
        raised.

      * The default behaviour on an uncaught exception is to record
        the time, the traceback, and the contents of the request.
        This is written to the configured error log (the ERROR_LOG
        parameter), and mailed to the configured e-mail address (the
        ERROR_EMAIL parameter).  If you wanted some different
        behaviour, you would have to subclass Publisher or
        SessionPublisher and override the finish_failed_request()
        method.
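
The commit-or-abort rule from the ZODB item above can be sketched
without any Quixote or ZODB code.  This is only an illustration of the
control flow: RecordingTransaction and run_request are hypothetical
names invented here, not part of either library, and a real
application would hook the same logic into its Publisher subclass.

```python
class RecordingTransaction:
    """Hypothetical stand-in for a transaction; it only records calls."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def abort(self):
        self.calls.append("abort")


def run_request(handler, request, txn):
    """Call handler(request); commit on success, abort on any exception."""
    try:
        result = handler(request)
    except Exception:
        # Discard partial changes, then re-raise so the usual error
        # handling (logging, e-mail) still sees the exception.
        txn.abort()
        raise
    txn.commit()
    return result


def ok_handler(request):
    return "page for %s" % request


def broken_handler(request):
    raise ValueError("simulated failure")


txn_ok = RecordingTransaction()
page = run_request(ok_handler, "/q/", txn_ok)

txn_bad = RecordingTransaction()
try:
    run_request(broken_handler, "/q/", txn_bad)
except ValueError:
    pass  # the exception still propagates after the abort
```

Re-raising after the abort keeps the error visible to whatever
reports failures, so the transaction handling does not hide bugs.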

This file won't try to explain how to write subclasses of the Quixote
classes; read the docstrings in the code for detailed explanations.


Configuration file
==================

In the example books.fcgi script, configuration information is read
from a file by this line of code:
    config.read_file('/www/conf/books.conf')

You should never edit the default values in quixote/config.py, because
your edits will be lost if you upgrade to a newer Quixote version.
You should certainly read it, though, to understand what all the
configuration parameters are.

The configuration file contains Python code, which is then evaluated
using Python's built-in function execfile().  Variable assignments are
performed within the Config object's dictionary, so it's easy to set
values:

        books.conf
        ----------
ACCESS_LOG = "/www/log/access/quixote.log"
DEBUG_LOG = "/www/log/quixote-debug.log"
ERROR_LOG = "/www/log/quixote-error.log"

You can also execute arbitrary Python code to figure out what the
variables should be.  The following example changes some settings to
be more convenient for a developer when the MX_MODE environment
variable is the string 'DEVEL':

import os

# MX_MODE must be set in the environment; a missing value raises KeyError.
mx_mode = os.environ["MX_MODE"]
if mx_mode == "DEVEL":
    DISPLAY_EXCEPTIONS = 1
    SECURE_ERRORS = 0
    RUN_ONCE = 1
elif mx_mode in ("STAGING", "LIVE"):
    DISPLAY_EXCEPTIONS = 0
    SECURE_ERRORS = 1
    RUN_ONCE = 0
else:
    raise RuntimeError("unknown server mode: %s" % mx_mode)

We use this flexibility to display tracebacks in DEVEL mode, to
redirect generated e-mails to a staging address in STAGING mode, and
to enable all features in LIVE mode.
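
The idea of executing a configuration file so that its assignments
land in a dictionary can be sketched as follows.  This is not
Quixote's Config class: read_config and DEFAULTS are hypothetical
names, the rule of keeping only upper-case names is an assumption made
for this sketch, and exec(compile(...)) is used because execfile()
does not exist in Python 3.

```python
import os
import tempfile

# Hypothetical defaults; the real ones live in quixote/config.py.
DEFAULTS = {"ERROR_LOG": None, "DISPLAY_EXCEPTIONS": 0}


def read_config(path, defaults):
    """Execute the Python file at path and return the resulting settings."""
    settings = dict(defaults)
    with open(path) as f:
        source = f.read()
    # Assignments made by the file end up as keys of `namespace`.
    namespace = dict(settings)
    exec(compile(source, path, "exec"), namespace)
    # Assumption of this sketch: only upper-case names are settings, so
    # helper variables and imported modules are left out.
    for name, value in namespace.items():
        if name.isupper():
            settings[name] = value
    return settings


conf_text = (
    'ERROR_LOG = "/www/log/quixote-error.log"\n'
    'debugging = True\n'
    'DISPLAY_EXCEPTIONS = int(debugging)\n'
)

# Write the sample configuration to a temporary file, then read it back.
with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as tmp:
    tmp.write(conf_text)
settings = read_config(tmp.name, DEFAULTS)
os.remove(tmp.name)
```

Because the file is ordinary Python, the helper variable 'debugging'
can drive a setting without becoming a setting itself.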


'books' Package
===============

Finally, we reach the most complicated part of a Quixote application.
However, thanks to Quixote's design, everything you've ever learned
about designing and writing Python code should be applicable, so there
are no new hoops to jump through.

An application's code lives in a Python package that contains both .py
and .ptl files.  Complicated logic should be in .py files, while .ptl
files, ideally, should contain only the logic needed to render your
Web interface and basic objects as HTML.

Quixote's publisher will start at the root of this package, and will
treat the rest of the URL as a path into the package's contents.  Here
are some examples, assuming that URL_PREFIX is '/q' and the root
package is 'books':

http://.../q/             will call    books._q_index()
http://.../q/other        will call    books.other(), if books.other
                                       is a function.
http://.../q/other        will call    books.other._q_index(), if books.other
                                       is a module or a subpackage.

One of PTL's design goals is "Be explicit."  Therefore there's no
complicated rule for remembering which functions in a module are
public; you just have to list them all in the _q_exports variable,
which should be a list of strings naming the public functions.  You
don't need to list the _q_index function as being public; that's
assumed.

        books/__init__.py
        -----------------

_q_exports = ["other"]

from pages import _q_index

def other(request):
    return "Handled by a Python function."
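
The lookup rules described so far (follow the path, honour _q_exports,
fall back to _q_index) can be sketched as below.  This is a simplified
illustration, not the Publisher's actual code: traverse and NotFound
are hypothetical names, plain namespace objects stand in for modules
and packages, and 'path' is assumed to be the part of the URL after
URL_PREFIX.

```python
import types


class NotFound(Exception):
    """Hypothetical error for paths that do not resolve."""


def traverse(root, path, request):
    """Resolve a '/'-separated path under root and call what it names."""
    obj = root
    for component in path.split("/"):
        if not component:
            continue  # skip empty pieces from leading/trailing slashes
        # Only names listed in _q_exports are reachable from the Web.
        if component not in getattr(obj, "_q_exports", []):
            raise NotFound(component)
        obj = getattr(obj, component)
    if callable(obj):
        return obj(request)
    # A namespace (module or package) is rendered by its _q_index(),
    # which never needs to appear in _q_exports.
    return obj._q_index(request)


reports = types.SimpleNamespace(
    _q_exports=[],
    _q_index=lambda request: "index of books.reports",
)
books = types.SimpleNamespace(
    _q_exports=["other", "reports"],
    _q_index=lambda request: "index of books",
    other=lambda request: "Handled by a Python function.",
    reports=reports,
    secret=lambda request: "never exported",
)

root_page = traverse(books, "/", None)
other_page = traverse(books, "/other", None)
reports_page = traverse(books, "/reports/", None)
try:
    traverse(books, "/secret", None)
    secret_blocked = False
except NotFound:
    secret_blocked = True
```

The 'secret' function exists in the namespace but is unreachable
because it is not named in _q_exports.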

When a function is callable from the Web, it must expect a single
parameter, which will be an object containing the contents of the HTTP
request.  'request' will be an instance of the HTTPRequest class, and
provides methods for reading form values, environment variables, and
the usual CGI-ish data.  When using SessionPublisher, request.session
is a Session object for the user agent making the request.

Use 'pydoc quixote.zope.HTTPRequest' to get a full listing of
HTTPRequest's methods.

The function must return either a string or a TemplateIO object; PTL
templates return a TemplateIO object.  request.response is an
HTTPResponse instance, which has methods for setting the content-type
of the function's output, generating an HTTP redirect, specifying
arbitrary HTTP response headers, and other common tasks.  Use 'pydoc
quixote.zope.HTTPResponse' to get a full listing of HTTPResponse's
methods.

There are two (and *only* two) ways to affect the Publisher's
traversal.

_q_access(request)

   If this function is present in a module, it will be called before
   attempting to traverse any further.  It can look at the contents of
   request and decide if the traversal can continue; if not, it should
   raise quixote.errors.AccessError (or a subclass), and Quixote will
   return a 403 Forbidden HTTP status code.  The return value is
   ignored if _q_access() doesn't raise an exception.

   For example, in the MEMS Exchange code, we have some sets of pages
   that are only accessible to signed-in users of a certain type.  The
   _q_access() function looks like this:

def _q_access (request):
    if request.session.user is None:
        raise NotLoggedInError("You must be signed in to view reports.")
    if not (request.session.user.is_MX() or
            request.session.user.is_fab()):
        raise MXAccessError("Only MEMS Exchange and fab staff can view "
                            "reports.")

   This is less error-prone than having to remember to add checks to
   every single public function.
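
   How an access check gates a handler can be sketched in isolation.
   Everything here is a stand-in: the local AccessError class only
   mimics quixote.errors.AccessError, call_with_access_check is a
   hypothetical helper rather than Publisher code, and the request
   objects are simple namespaces.

```python
import types


class AccessError(Exception):
    """Local stand-in for quixote.errors.AccessError."""


def _q_access(request):
    if request.session.user is None:
        raise AccessError("You must be signed in.")


def _q_index(request):
    return "secret reports"


def call_with_access_check(access, handler, request):
    """Return (status, body): run access first, then handler if allowed."""
    try:
        access(request)  # the return value is deliberately ignored
    except AccessError as exc:
        return 403, str(exc)
    return 200, handler(request)


anonymous = types.SimpleNamespace(session=types.SimpleNamespace(user=None))
signed_in = types.SimpleNamespace(session=types.SimpleNamespace(user="amk"))

denied = call_with_access_check(_q_access, _q_index, anonymous)
allowed = call_with_access_check(_q_access, _q_index, signed_in)
```

   The handler itself contains no permission logic; the single check
   in front of it covers every function in the module.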


_q_getname(request, component)

   This function translates an arbitrary string into an object that we
   continue traversing.  This is very handy; it lets you put
   user-space objects into your URL-space, eliminating the need for
   digging ID strings out of a query, or checking PATH_INFO after
   Quixote's done with it.  But it is a compromise with security: it
   opens up the traversal algorithm to arbitrary names not listed in
   _q_exports.  You should therefore be extremely paranoid about
   checking the value of 'component'.

   'request' is the request object, as it is everywhere else;
   'component' is a string containing the next chunk of the path.
   _q_getname() should return some object that can be traversed
   further, so it should have a _q_index() method, a _q_exports
   attribute, and optionally _q_access() or its own _q_getname().
   We generally write special classes for this purpose, though you
   could choose a particular module and return that instead.

   For example, we want people to be able to go to
   http://.../q/run/250/ to view run #250.  This is more readable than
   the alternatives '/q/run/?id=250' or even '/q/run?250'.  The
   corresponding function and class look like this:

def _q_getname (request, component):
    return RunUI(request, component)

class RunUI:
    _q_exports = ['details']

    def __init__ (self, request, component):
        run_db = get_run_database()
        self.run = run_db.get_run(run_id, run_version)
        if not self.run.can_access(request.session.user):
            raise MXAccessError("You are not allowed to access run %d." %
                                run_id)

    def _q_index (self, request):
        ...
    def details (self, request):
        ...

The __init__() method is actually much longer than shown here: the
code that sets run_id and run_version is omitted, and the real method
is very paranoid about checking whether the value of 'component' is
actually a number, whether the run exists, and whether the user is
permitted to view that run.
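
One way to be paranoid about 'component' is to accept only a short
string of ASCII digits before converting it.  The sketch below is an
assumption about what such a check might look like, not the MEMS
Exchange code: parse_run_id, BadRunID, and the MAX_DIGITS limit are
all invented for this example.

```python
class BadRunID(Exception):
    """Hypothetical error; real code might raise a Quixote error class."""


MAX_DIGITS = 9  # arbitrary cap chosen for this sketch


def parse_run_id(component):
    """Return component as a positive int, or raise BadRunID."""
    if not (0 < len(component) <= MAX_DIGITS):
        raise BadRunID("run ID has an unacceptable length")
    # Plain ASCII digits only: no sign, whitespace, or decimal point.
    if not all(ch in "0123456789" for ch in component):
        raise BadRunID("run ID must contain only digits")
    run_id = int(component)
    if run_id == 0:
        raise BadRunID("run IDs start at 1")
    return run_id


def is_rejected(component):
    """Report whether parse_run_id() refuses the given string."""
    try:
        parse_run_id(component)
    except BadRunID:
        return True
    return False


good = parse_run_id("250")
bad_inputs = ["", "0", "-1", "25a", " 250", "../etc", "1" * 10]
all_rejected = all(is_rejected(value) for value in bad_inputs)
```

Checking the characters before calling int() avoids accepting forms
that int() tolerates on its own, such as surrounding whitespace or a
leading sign.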


--
A.M. Kuchling    
Neil Schemenauer 
Greg Ward