[Quixote-users] Accept-aware functions via metaclasses

Hi everyone,

I've been writing a Quixote app in a somewhat RESTful style that needs
to do content negotiation on certain requests. By 'content negotiation',
I mean that I need to examine the Accept header in the request, and
return an appropriate response in one of the MIME types declared there.

A recurrent pattern was cropping up in my code:

def _q_index(req):
        acceptlist = req.environ['HTTP_ACCEPT'].split(',')
        if 'text/html' in acceptlist:
                # return a text/html representation
        elif 'text/xml' in acceptlist:
                # return a text/xml representation
        ...

Ugly, hard to maintain, and inaccurate. Aside from the obvious
repetition, the Accept matching was not as expressive as it should have
been. For example, I wanted to accommodate relative quality factors (RFC
2616, sec. 14.1), so that the Accept directive:

        text/plain;q=.5, text/xml, */*;q=.1

would resolve to a 'text/xml' response, because text/xml has the default
quality of 1 and is therefore the type the client prefers most.
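
As a rough sketch of that resolution step (not the module's own code, and
written for a current Python 3 rather than the Python 2 of the attachment),
the header can be split into (type, quality) pairs and sorted. The name
parse_accept is made up for this illustration:

```python
def parse_accept(header):
    """Return (mimetype, quality) pairs, highest quality first."""
    entries = []
    for item in header.split(','):
        parts = [p.strip() for p in item.split(';')]
        quality = 1.0  # RFC 2616 default when no q parameter is given
        for param in parts[1:]:
            if param.startswith('q='):
                quality = float(param[2:])
        entries.append((parts[0], quality))
    # sorted() is stable, so types of equal quality keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)

ranked = parse_accept('text/plain;q=.5, text/xml, */*;q=.1')
print(ranked[0][0])  # the type the client prefers most
```

For the directive above, text/xml sorts first, followed by text/plain and
then */*.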

What I really needed, then, was a function or method that could
self-resolve the appropriate Content-Type to return, and act accordingly.

I came up with a plan, incanted the forbidden secret phrases and
descended into the realm of metaclass hackery. (Beware, my code is alpha
0.1, unoptimized and likely shows the shakiness of one who does little
metaclass stuff.)

The enclosed module, 'qclasses', contains a metaclass type,
'acceptfunction', that allows you to write Quixote code like this PTL
example:

        from qclasses import acceptfunction
        from somewhere import some_header, some_footer

        _q_exports = []

        class _q_index:

                __metaclass__ = acceptfunction

                def text_html [html] (cls, req):
                        some_header(req)
                        'An HTML response!'
                        some_footer(req)

                def text_xml [html] (cls, req):
                        ''

                def text_ [plain] (cls, req):
                        'A response to text/* requests'

                def default [plain] (cls, req):
                        req.response.set_content_type('text/plain')
                        'A response to */* requests'

Above, _q_index is a class, but it acts like a function. The metaclass
injects a replacement __new__ method (the "callable" interface of the
class) that does the dispatching.
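
The trick can be seen in isolation in the minimal sketch below. It assumes
a current Python 3 (hence `metaclass=` instead of `__metaclass__`), and the
names callable_class, greet and handle are invented for the illustration;
none of them are part of qclasses:

```python
class callable_class(type):
    """Hypothetical metaclass: calling the class runs its handle() method."""
    def __init__(cls, name, bases, dct):
        super().__init__(name, bases, dct)
        def _dispatch(klass, req):
            # __new__ returns something that is not an instance of the
            # class, so __init__ is skipped and the caller simply receives
            # the handler's return value.
            return klass.handle(klass, req)
        cls.__new__ = staticmethod(_dispatch)

class greet(metaclass=callable_class):
    def handle(cls, req):
        return 'hello, %s' % req

result = greet('world')
print(result)
```

Here "calling" greet never produces a greet instance; the call is routed
straight to handle(), which is the same mechanism acceptfunction relies on.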

The decision re: which method to pick is based on relative-quality
weightings, as described above. The 'text_html()' method matches
'text/html' requests, 'text_()' matches 'text/*' requests and default()
matches '*/*' requests. All of these are optional, and if no handler can
be found, a somewhat helpful 406 Not Acceptable response is returned.
Note also that punctuation in MIME names is flattened into underscores,
so 'application/xhtml+xml' is handled by 'application_xhtml_xml()'.
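
The naming convention can be sketched as a small function. This is only an
illustration in Python 3: the module itself works in the opposite
direction (from method names to lookup keys), and handler_name is a
hypothetical helper. Like the module, it flattens only '+' and '-':

```python
import re

def handler_name(mimetype):
    """Map a MIME type to the method name expected to handle it."""
    if mimetype == '*/*':
        return 'default'
    supertype, subtype = mimetype.split('/')
    if subtype == '*':
        return supertype + '_'
    # '+' and '-' in the subtype become underscores
    return supertype + '_' + re.sub('[+-]', '_', subtype)

for mimetype in ('text/html', 'text/*', '*/*', 'application/xhtml+xml'):
    print(mimetype, '->', handler_name(mimetype))
```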

There is very little magic in the metaclass. If the request is Accept:
text/html, and you don't have a text_html() method, then it will fail.
It won't fail over to 'default()', which is only for '*/*', nor to
'text_()', which is only for 'text/*'. The only magic I added covers a
common case: if there is no default() but there is a text_html(), then
'*/*' requests go to text_html(). Look for 'magic' in the docstrings for
more info.
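
To make the "no failover" rule concrete, here is a small Python 3 sketch
of a strict lookup. It is a simplification of what the module does; the
name pick_handler and the use of LookupError in place of a real 406
response are assumptions for the example:

```python
def pick_handler(accepted, handlers):
    """Return the first handler whose key exactly matches an accepted type.

    `accepted` is a list of MIME types already sorted by quality;
    `handlers` maps MIME types (wildcards included) to callables.
    """
    for mimetype in accepted:
        if mimetype in handlers:
            return handlers[mimetype]
    raise LookupError('406 Not Acceptable: %s' % ', '.join(accepted))

handlers = {'text/*': lambda: 'any text', '*/*': lambda: 'anything'}

# 'text/html' matches neither the 'text/*' key nor the '*/*' key.
try:
    pick_handler(['text/html'], handlers)
    outcome = 'matched'
except LookupError:
    outcome = 'no handler'
print(outcome)
```

A wildcard handler is only reached when the client itself lists the
wildcard, as in Accept: text/html, */*.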

There are copious notes in the docstrings about the implementation, so
I'll stop there. ;-)

Oh, there's another metaclass, qx_namespace, that fixes a problem when
using a class as a Quixote namespace (a '' request tries to "call" the
class, rather than redirecting to PATH_INFO + '/'). This is not related
to the acceptfunction metaclass, but might be helpful to some. You can
use it like this:

        _q_exports = ['bar']

         ...

        class bar:
                __metaclass__ = qx_namespace

                def _q_index (cls, req):   # becomes a class method
                        ...

and if you visit '/bar' in your browser, you'll be redirected to '/bar/'
instead of getting an undesirable response (e.g. a str(bar())
representation).

Back to acceptfunction. A similar approach could be used to make
'function classes' for functions that need to dispatch based on method
(GET, PUT, ...), user agent, locale, etc. (Though I dread to think about
having to write a _q_index.text_html.put.en_us() function and its
hundred-odd companions!)
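
A method-dispatching variant might look roughly like the sketch below.
It is untested against Quixote, assumes Python 3, takes a plain environ
dictionary instead of a real request object, and the names methodfunction
and resource are hypothetical:

```python
class methodfunction(type):
    """Hypothetical metaclass that dispatches on REQUEST_METHOD."""
    def __init__(cls, name, bases, dct):
        super().__init__(name, bases, dct)
        def _dispatch(klass, environ):
            # look for a handler named after the lower-cased HTTP method
            handler = dct.get(environ['REQUEST_METHOD'].lower())
            if handler is None:
                return '405 Method Not Allowed'
            return handler(klass, environ)
        cls.__new__ = staticmethod(_dispatch)

class resource(metaclass=methodfunction):
    def get(cls, environ):
        return 'fetched'
    def put(cls, environ):
        return 'stored'

print(resource({'REQUEST_METHOD': 'GET'}))
print(resource({'REQUEST_METHOD': 'DELETE'}))
```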

One last note: HTTP_ACCEPT support /may/ be inconsistent across the
known Quixote handlers, so check that yours sets it in req.environ.

Enjoy!

-- Graham

"""
Metaclasses for doing neat tricks with Quixote namespaces and callables.
An early alpha release.
Graham Fawcett, September 2003.
"""

from quixote.errors import PublishError
import types
import re

DEBUG = 0
ALLOW_HTML_DEFAULT = 1

class qx_namespace(type):
    """
    A Quixote-specific metaclass for classes used as Quixote namespaces.
    Supplies a __new__ method that prevents the class from being called
    as a "callable", and instead redirects with a suffixed '/'.

    Example: if your request path is '/foo/bar', and Quixote maps
    'bar' onto a class, then it will try to "call" the class by invoking
    its constructor. The desired behaviour, however, is that the client
    is redirected to '/foo/bar/'. Adding a __new__ method to the class
    short-circuits the constructor call, and returns a Quixote redirect
    instead of a class instance.
    """

    def __init__(cls, name, supers, dct):
        def _redirect(cls, req):
            return req.redirect(req.environ['PATH_INFO'] + '/')
        cls.__new__ = staticmethod(_redirect)




class acceptfunction(type):
    """
    A Quixote-specific metaclass for a class that behaves like a function.
    The function expects a 'request' parameter, and based on
    request.environ['HTTP_ACCEPT'], delegates the response to an appropriate
    handler method within the class.

    The goal is to facilitate content negotiation in RESTful applications.

    Important Note: It is assumed that your request handler is correctly
    setting the HTTP_ACCEPT value in your request.environ dictionary. If
    it is not found, a descriptive KeyError will be raised. At present, I
    believe that support for HTTP_ACCEPT is inconsistent across Quixote
    frontends.

    The handler lookup is done as follows:

    - HTTP_ACCEPT is decomposed into MIME types sorted in quality order.
      See RFC 2616 sec 14.1 for more info on quality values.

    - for each MIME type in turn, a suitable handler is requested from the
      class. The lookup is based on method names that match the MIME type
      name. For example, for the MIME type 'text/html', a handler called
      'text_html()' is requested.

    - To handle requests of type 'text/*' you can register a handler with
      the signature 'text_()'.

    - To handle '*/*' requests, you can register a handler for 'default()'.

      Note: I added a bit of magic (just a bit), so that if you do not
      specify a default handler, but you have a handler capable of dealing
      with Accept: text/html, then that processor will be used. This is to
      deal with the common case where (a) most of your uses will be to handle
      text/html and one other format, and (b) where Internet Explorer is used,
      which frequently sends Accept: */* and nothing else.

    - the 'default()' handler is ONLY used to match '*/*'. It is not a
      failover handler.

    - if no handler can be found, then a 406 Not Acceptable error is returned.

    The methods are all automatically converted into class methods,
    so your handler signatures must be in the form 'text_html(cls, request)'.

    Automatic Content-Type setting

      For explicit handlers (like 'text_html()') the class will automatically
      set the MIME type via req.response.set_content_type().

      For wildcard types (like 'text_()' or 'default'), the MIME type will
      be set to class._default_mimetype. If this attribute is not found,
      then 'application/octet-stream' is used as a last resort.

      Of course, you can still set the Content-Type explicitly in your
      handler if you prefer.
    """

    def __init__(cls, name, supers, dct):
        # first set all the attributes.
        # all methods become class methods.
        methods = {}
        for k, v in dct.items():
            if type(v) is types.FunctionType:
                setattr(cls, k, classmethod(v))
                cmeth = getattr(cls, k)
                methods[k] = cmeth
            else:
                setattr(cls, k, v)
        # build a lookup table for methods, based on the
        # MIME types they are intended to support.
        lookup = {}
        for mname, meth in methods.items():
            if not mname.startswith('_') and '_' in mname:
                # this is the signature of a MIME-handling method.
                mtype, msubtype = [part and part or '*' \
                                   for part in mname.split('_', 1)]
                if DEBUG:
                    print 'acceptfunction: adding lookup to %s for %s' % \
                                            (name, (mtype, msubtype))
                lookup[(mtype, msubtype)] = meth

        # try to put the default handler in the lookup
        try:
            lookup[('*','*')] = methods['default']
            if DEBUG:
                print 'acceptfunction: adding default lookup for */*'
        except KeyError:
            # no default handler found.
            if ALLOW_HTML_DEFAULT:
                # you can change this if you want, but this will
                # set the default to the most likely text/html handler,
                # if one can be found. Why is this here? Frankly, because
                # of InternetExplorer, which frequently sends
                # Accept: */* as its directive.
                try:
                    lookup[('*','*')] = \
                                get_best_handler('text/html', lookup)[0]
                    if not hasattr(cls, '_default_mimetype'):
                        cls._default_mimetype = 'text/html'
                except NoHandlerFor:
                    pass

        # "calling" a class looks like an instantiation request.
        # So, we create a __new__ staticmethod that dispatches
        # requests to the appropriate method.
        cls.__new__ = staticmethod(_accept_dispatch)
        cls._dispatch_lookup = lookup
        # no return value: __init__ must return None.

def _accept_dispatch(cls, req):
    """
    A drop-in '__new__' method for acceptfunction-derived classes.
    Does the actual dispatching to the best class method for the
    request's Accept directive.
    """
    if DEBUG:
        print '_accept_dispatch: ', '-' * 50

    try:
        accept = req.environ['HTTP_ACCEPT']
    except KeyError:
        raise KeyError, 'HTTP_ACCEPT not found in request environ!'

    handler, helement = get_best_handler(accept, cls._dispatch_lookup)
    # as a convenience, set the content-type on the response
    # based on the mime type of the handler.
    # if using a wildcard handler, then use
    # cls._default_mimetype, or a suitable catch-all.
    mimetype = str(helement)
    if '*' in mimetype:
        mimetype = getattr(cls, '_default_mimetype',
                                'application/octet-stream')
    req.response.set_content_type(mimetype)
    if DEBUG:
        print '_accept_dispatch: mimetype is', mimetype
    return handler(req)




def get_best_handler(accept, lookup):
    """
    Given an Accept directive and a lookup table, find the best handler
    for the Accept directive.
    """
    acceptlist = _formats_accepted(accept)
    handler = None
    best_element = None
    if DEBUG:
        print 'get_best_handler: acceptlist is', acceptlist
    for acc in acceptlist:
        if DEBUG:
            print 'get_best_handler: searching for', acc.key()
        if lookup.has_key(acc.key()):
            best_element = acc
            break
    else:
        raise NoHandlerFor(accept, lookup.keys())
    handler = lookup[best_element.key()]
    if DEBUG:
        print 'get_best_handler: best for %s is %s' % (accept, best_element)
    return (handler, best_element)




def _formats_accepted(accept):
    """
    Given an HTTP Accept header value (see RFC 2616, section 14.1),
    return a sorted list of allowed types, in quality order.

    Return value is a list of _accept_element instances. Each element's
    key() method gives a ('type', 'subkey') tuple, where subkey is just
    the subtype with '+' and '-' replaced with underscores, to
    facilitate lookup.
    """
    # parse each element, then sort by descending quality.
    if DEBUG:
        print '_formats_accepted: examining', accept
    elements = [_accept_element(e) for e in accept.split(',')]
    elements.sort()
    return elements



class _accept_element:
    """
    An element in an Accept string.
    Examples in text would include 'text/html', 'text/*', '*/*;q=.3'
    This class provides attributes/methods to support sorting and
    lookups based on Accept elements.
    """
    def __init__(self, acceptstring):
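        # split off any parameters. Only 'q' affects the ordering; other
        # parameters (e.g. 'level=1') are ignored instead of breaking
        # the parse.
        parts = [p.replace(' ', '') for p in acceptstring.split(';')]
        mimetype = parts[0]
        self.quality = 1.0 # the RFC 2616 default
        for param in parts[1:]:
            if param.startswith('q='):
                self.quality = float(param[2:])
        self.mimetype = mimetype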
        self.supertype, self.subtype = self.mimetype.split('/')
        self.subkey = re.sub('[+-]', '_', self.subtype)
    def __repr__(self):
        return self.mimetype

    def key(self):
        """
        Return a key for looking up an appropriate handler method.
        """
        return (self.supertype, self.subkey)

    def __cmp__(self, other):
        """
        When sorting a list of elements, we want the high-quality
        values at the front of the list.
        """
        return cmp(other.quality, self.quality)




class NoHandlerFor(PublishError):
    """
    A not-very-good 406 Not Acceptable error.
    It's okay in that it gives detail about what handlers are available,
    but not in a machine readable format; so it doesn't work as a
    discovery method. See RFC 2616, sec 10.4.7 for more info.
    """
    status_code = 406 # Not Acceptable
    title = 'Not Acceptable'
    private_msg = ''
    def __init__(self, acceptstring, accepted_types):
        self.description = ('Could not satisfy the requested '
                            'content-types: %s' % acceptstring)
        self.accepted_types = accepted_types
        self.public_msg = ','.join(map(str, self.accepted_types))
    def format (self, request):
        return self.public_msg