I've just checked the following in as doc/programming.txt.
Comments?
And is there anything else that needs to be done before
we can make a 0.3 release?
--amk
Programming Overview
====================
This file explains how a Quixote application is structured, using a
sample application called 'books'.
There are three components to a Quixote application:
1) A FastCGI script, usually called books.fcgi. This script will
create a Publisher object, and may customize the publisher in some
application-specific way. It will also tell the publisher what the
root package name for the application is; for example, at the MEMS
Exchange it's 'mems.ui', because all our Web code lives in the
mems.ui package. For our books example, we'll call the package
simply 'books'.
FastCGI isn't an absolute requirement. It should be trivially easy
to run Quixote apps using regular CGI, but this would be quite slow
due to the number of modules that need to be imported at startup
time. We haven't tried to run Quixote under mod_python/mod_snake,
but it shouldn't be too difficult; if you try it, please let us
know how it goes.
2) A configuration file, usually called books.conf. This file sets
options for the Publisher class, such as how errors are handled and
where the various log files are written. Read through
quixote/config.py for the full list of configuration settings.
The most critical configuration parameters are:
    URL_PREFIX     Prefix of URLs that will be directed to Quixote
    ERROR_EMAIL    E-mail address to which errors will be mailed
    ERROR_LOG      File to which errors will be logged
3) Finally, the bulk of the code will be in a Python package; the
Publisher class will be set up to start traversing at the package's
root. For our books example, the package will be named simply
'books', so Python code could do 'from books import foo,bar'.
FastCGI script
==============
The FastCGI script can be very simple:
books.fcgi
----------
    #!/usr/bin/python

    from quixote.config import Config
    from quixote.publish import Publisher

    PACKAGE_NAME = 'books'

    def main():
        # Create configuration object with default values
        config = Config()

        # Read a configuration file
        config.read_file('/www/conf/books.conf')

        # Create a Publisher instance using the configuration
        pub = Publisher(PACKAGE_NAME, config)

        # Install the PTL import hook
        pub.install_imphooks()

        # Enter the publishing main loop
        pub.publish_cgi()

    if __name__ == '__main__':
        main()
That's the simplest possible case. The SessionPublisher class
in quixote.publish can also be used; it provides session tracking.
The changes required to use SessionPublisher would be:
    ...
    from quixote.publish import SessionPublisher
    from quixote.session import SessionManager
    ...
    pub = SessionPublisher(PACKAGE_NAME, config)
    pub.set_session_manager(SessionManager())
    ...
It's also possible to subclass the Publisher or SessionManager classes
in order to provide some specialized behaviour necessary for your
application. Some uses for this would be:
* SessionManager stores user sessions in an in-memory dictionary,
so restarting the FastCGI process will cause all the sessions
to be lost. The FastCGI process is terminated if the UI code
raises an uncaught exception, so if sessions are used to
contain important data, such as the contents of a shopping
cart or a MEMS process sequence, it's better if sessions are
stored in some persistent way.
* If you're using the ZODB to store data, after each request
you'll want to commit the current transaction if everything
ran smoothly, or abort the transaction if an exception was
raised.
* The default behaviour on an uncaught exception is to record
the time, the traceback, and the contents of the request.
This is written to the configured error log (the ERROR_LOG
parameter), and mailed to the configured e-mail address (the
ERROR_EMAIL parameter). If you wanted some different
behaviour, you would have to subclass Publisher or
SessionPublisher and override the finish_failed_request()
method.
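To make the first point concrete, here is a minimal sketch of keeping
session data on disk so that it survives a process restart. It does not
use or imitate Quixote's SessionManager interface; the class name
ShelfSessionStore, its save() and load() methods, and the demo()
function are all hypothetical, and the standard shelve module is simply
one convenient choice of storage.

```python
import os
import shelve
import tempfile


class ShelfSessionStore:
    """Hypothetical session store backed by a shelve file.

    This is not Quixote's SessionManager interface; it only shows the
    idea of keeping session data on disk so it survives a restart.
    """

    def __init__(self, path):
        self.path = path

    def save(self, session_id, data):
        # Open, write, and close on every call so the data reaches disk
        # even if the process dies right afterwards.
        db = shelve.open(self.path)
        try:
            db[session_id] = data
        finally:
            db.close()

    def load(self, session_id):
        # Return the stored data, or None for an unknown session ID.
        db = shelve.open(self.path)
        try:
            return db.get(session_id)
        finally:
            db.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "sessions")
    ShelfSessionStore(path).save("abc123", {"cart": ["book-1", "book-2"]})
    # A brand-new store object stands in for a restarted process.
    return ShelfSessionStore(path).load("abc123")
```

Opening and closing the file on every call is slow but keeps the sketch
simple; a real application would probably want locking and some way of
expiring old sessions.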
This file won't try to explain how to write subclasses of the Quixote
classes; read the docstrings in the code for detailed explanations.
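Independently of any Quixote class, the commit-or-abort pattern
mentioned for ZODB-style storage can be sketched in a few lines. The
names FakeTransaction and run_request are hypothetical, and the
transaction object here is a stand-in that only records what was called;
it is not the ZODB API.

```python
class FakeTransaction:
    """Hypothetical stand-in for a database transaction object."""

    def __init__(self):
        self.log = []

    def commit(self):
        self.log.append("commit")

    def abort(self):
        self.log.append("abort")


def run_request(handler, request, transaction):
    """Call handler(request); commit on success, abort on any exception."""
    try:
        result = handler(request)
    except Exception:
        # Something went wrong: discard any changes, then let the
        # exception continue to whatever reports errors.
        transaction.abort()
        raise
    transaction.commit()
    return result
```

Re-raising after the abort keeps error reporting separate from
transaction handling, so the two concerns can be changed independently.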
Configuration file
==================
In the example books.fcgi script, configuration information is read
from a file by this line of code:
    config.read_file('/www/conf/books.conf')
You should never edit the default values in quixote/config.py, because
your edits will be lost if you upgrade to a newer Quixote version.
You should certainly read it, though, to understand what all the configuration
parameters are.
The configuration file contains Python code, which is then evaluated
using Python's built-in function execfile(). Variable assignments are
performed within the Config object's dictionary, so it's easy to set
values:
books.conf
----------
    ACCESS_LOG = "/www/log/access/quixote.log"
    DEBUG_LOG = "/www/log/quixote-debug.log"
    ERROR_LOG = "/www/log/quixote-error.log"
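The general technique of treating a configuration file as Python code
can be sketched as follows. This is an illustration of the idea only,
not Quixote's Config implementation: the function name load_config, the
use of exec() on a source string instead of execfile() on a file, and
the rule of keeping only upper-case names are all assumptions made for
the example.

```python
def load_config(source, defaults):
    """Run Python source text with a copy of 'defaults' as its namespace.

    Returns the resulting settings, keeping only upper-case names so
    that helper variables in the file are not mistaken for settings.
    """
    namespace = dict(defaults)
    # Assignments made by the configuration code land in 'namespace',
    # overriding the defaults copied into it.
    exec(compile(source, "<config>", "exec"), namespace)
    return dict((name, value) for name, value in namespace.items()
                if name.isupper())
```

Because the defaults are copied first, the caller's dictionary is left
untouched, and any setting the file does not mention keeps its default
value.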
You can also execute arbitrary Python code to figure out what the
variables should be. The following example changes some settings to
be more convenient for a developer when the MX_MODE environment
variable is the string 'DEVEL':
    import os

    mx_mode = os.environ["MX_MODE"]
    if mx_mode == "DEVEL":
        DISPLAY_EXCEPTIONS = 1
        SECURE_ERRORS = 0
        RUN_ONCE = 1
    elif mx_mode in ("STAGING", "LIVE"):
        DISPLAY_EXCEPTIONS = 0
        SECURE_ERRORS = 1
        RUN_ONCE = 0
    else:
        raise RuntimeError("unknown server mode: %s" % mx_mode)
We use this flexibility to display tracebacks in DEVEL mode, to
redirect generated e-mails to a staging address in STAGING mode, and
to enable all features in LIVE mode.
'books' Package
===============
Finally, we reach the most complicated part of a Quixote application.
However, thanks to Quixote's design, everything you've ever learned
about designing and writing Python code should be applicable, so there
are no new hoops to jump through.
An application's code lives in a Python package that contains both .py
and .ptl files. Complicated logic should be in .py files, while .ptl
files, ideally, should contain only the logic needed to render your
Web interface and basic objects as HTML.
Quixote's publisher will start at the root of this package, and will
treat the rest of the URL as a path into the package's contents. Here
are some examples, assuming that the URL_PREFIX is '/q' and the root
package is 'books':

    http://.../q/         will call books._q_index()
    http://.../q/other    will call books.other(), if books.other
                          is a function.
    http://.../q/other    will call books.other._q_index(), if
                          books.other is a module or a subpackage.

One of PTL's design goals is "Be explicit." Therefore there's no
complicated rule for remembering which functions in a module are
public; you just have to list them all in the _q_exports variable,
which should be a list of strings naming the public functions. You
don't need to list the _q_index function as being public; that's
assumed.
books/__init__.py
-----------------
    _q_exports = ["other"]

    from pages import _q_index

    def other(request):
        return "Handled by a Python function."
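To illustrate how a URL path, _q_exports, and _q_index fit together,
here is a toy resolver. It is a simplified sketch, not the Publisher's
real traversal code: the names resolve(), NotFound, and make_books(),
and the 'admin' and 'secret' attributes, are hypothetical, and the
handlers take a request argument that the sketch never inspects.

```python
import types


class NotFound(Exception):
    """Raised when a URL path does not map to an exported name."""


def resolve(root, path):
    """Map a URL path such as '/other' to a callable under 'root'.

    Toy rules: each path component must be listed in the current
    namespace's _q_exports; a namespace (anything that is not callable)
    is answered by its _q_index.
    """
    obj = root
    for component in [part for part in path.split("/") if part]:
        if component not in getattr(obj, "_q_exports", []):
            raise NotFound(component)
        obj = getattr(obj, component)
    if not callable(obj):
        # Modules and packages are represented by their _q_index.
        obj = obj._q_index
    return obj


def make_books():
    """Build a stand-in for the 'books' package out of module objects."""
    books = types.ModuleType("books")
    books._q_exports = ["other", "admin"]
    books._q_index = lambda request: "index page"
    books.other = lambda request: "Handled by a Python function."
    admin = types.ModuleType("books.admin")
    admin._q_exports = []
    admin._q_index = lambda request: "admin index"
    books.admin = admin
    # Present in the package but not exported, so never reachable.
    books.secret = lambda request: "not public"
    return books
```

With this setup, resolve(make_books(), "/secret") raises NotFound even
though books.secret exists, which is the point of listing public names
explicitly.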
When a function is callable from the Web, it must expect a single
parameter, which will be an object containing the contents of the HTTP
request. 'request' will be an instance of the HTTPRequest class, and
provides methods for reading form values, environment variables, and
the usual CGI-ish data. When using SessionPublisher, request.session
is a Session object for the user agent making the request.
Use 'pydoc quixote.zope.HTTPRequest' to get a full listing of
HTTPRequest's methods.
The function must return either a string or a TemplateIO object; PTL
templates return a TemplateIO object. request.response is an
HTTPResponse instance, which has methods for setting the content-type
of the function's output, generating an HTTP redirect, specifying
arbitrary HTTP response headers, and other common tasks. Use 'pydoc
quixote.zope.HTTPResponse' to get a full listing of HTTPResponse's
methods.
There are two (and *only* two) ways to affect the Publisher's
traversal.
_q_access(request)
If this function is present in a module, it will be called before
attempting to traverse any further. It can look at the contents of
request and decide if the traversal can continue; if not, it should
raise quixote.errors.AccessError (or a subclass), and Quixote will
return a 403 Forbidden HTTP status code. The return value is
ignored if _q_access() doesn't raise an exception.
For example, in the MEMS Exchange code, we have some sets of pages
that are only accessible to signed-in users of a certain type. The
_q_access() function looks like this:
    def _q_access (request):
        if request.session.user is None:
            raise NotLoggedInError("You must be signed in to view reports.")
        if not (request.session.user.is_MX() or
                request.session.user.is_fab()):
            raise MXAccessError("Only MEMS Exchange and fab staff can view "
                                "reports.")
This is less error-prone than having to remember to add checks to
every single public function.
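The way such a check guards everything beneath it can be sketched with a
toy model. Nothing here is Quixote code: AccessError is a local stand-in
for quixote.errors.AccessError, the request is a plain dictionary rather
than an HTTPRequest, and check_access(), to_status(), and Reports are
hypothetical names.

```python
class AccessError(Exception):
    """Toy stand-in for quixote.errors.AccessError."""


def check_access(namespaces, request):
    """Call _q_access(request) on each namespace that defines one.

    'namespaces' lists the objects passed through during traversal,
    outermost first. Any exception simply propagates to the caller.
    """
    for namespace in namespaces:
        access = getattr(namespace, "_q_access", None)
        if access is not None:
            access(request)  # the return value is deliberately ignored


def to_status(namespaces, request):
    """Translate the outcome of the access checks into an HTTP status."""
    try:
        check_access(namespaces, request)
    except AccessError:
        return 403
    return 200


class Reports:
    """Hypothetical namespace whose pages require a signed-in user."""

    @staticmethod
    def _q_access(request):
        if request.get("user") is None:
            raise AccessError("You must be signed in to view reports.")
```

Because the check runs once per namespace rather than once per page, a
new page added under Reports is protected without any extra code.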
_q_getname(request, component)
This function translates an arbitrary string into an object that we
continue traversing. This is very handy; it lets you put
user-space objects into your URL-space, eliminating the need for
digging ID strings out of a query, or checking PATH_INFO after
Quixote's done with it. But it is a compromise with security: it
opens up the traversal algorithm to arbitrary names not listed in
_q_exports. You should therefore be extremely paranoid about
checking the value of 'component'.
'request' is the request object, as it is everywhere else;
'component' is a string containing the next chunk of the path.
_q_getname() should return some object that can be traversed
further, so it should have a _q_index() method, a _q_exports
attribute, and optionally _q_access() or its own _q_getname().
We generally write special classes for this purpose, though you
could choose a particular module and return that instead.
For example, we want people to be able to go to
http://.../q/run/250/ to view run #250. This is more readable than
the alternatives '/q/run/?id=250' or even '/q/run?250'. The
corresponding function and class look like this:
    def _q_getname (request, component):
        return RunUI(request, component)

    class RunUI:
        _q_exports = ['details']

        def __init__ (self, request, component):
            # Validation of 'component' and the derivation of run_id and
            # run_version from it are omitted here.
            run_db = get_run_database()
            self.run = run_db.get_run(run_id, run_version)
            if not self.run.can_access(request.session.user):
                raise MXAccessError("You are not allowed to access run %d." %
                                    run_id)

        def _q_index (self, request):
            ...

        def details (self, request):
            ...
The __init__() method is actually much longer, and is very paranoid
about checking that the value of 'component' is actually a number,
that the run exists, and that the user is permitted to view that run.
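As a rough idea of what the first of those checks might look like, here
is a sketch of a strict parser for a numeric path component. It is not
the MEMS Exchange code: the names parse_run_id and BadComponent, the
limit of nine digits, and the assumption that run numbers start at 1
are all invented for the example.

```python
class BadComponent(Exception):
    """Raised when a URL component is not a valid run number."""


def parse_run_id(component, max_digits=9):
    """Return 'component' as a positive int, or raise BadComponent.

    Only plain ASCII digits are accepted, so signs, whitespace, and
    anything else int() might tolerate are rejected up front.
    """
    if not (0 < len(component) <= max_digits):
        raise BadComponent("bad length")
    if not all(ch in "0123456789" for ch in component):
        raise BadComponent("not a number")
    run_id = int(component)
    if run_id == 0:
        # Assumption for this sketch: run numbers start at 1.
        raise BadComponent("run numbers start at 1")
    return run_id
```

Checking the characters before calling int() avoids relying on int()'s
own, more permissive idea of what counts as a number.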
--
A.M. Kuchling
Neil Schemenauer
Greg Ward