Greetings Quixote and QP users,

[tl;dr: see the patch below]

Recently I ran into trouble with a certain client, and the problem appears to be caused by unexpected caching of dynamic resources. I haven't been able to conclusively determine the problem yet, but while looking into it, I realized that Quixote's use of the Expires header is not sufficient to prevent unwanted caching (QP inherited the same code). Has anyone else seen caching problems "in the wild" where "Expires: -1" is not sufficient?

Specifically, RFC 2616 (HTTP 1.1) recognizes the Expires header but allows clients and proxies to ignore it (i.e. use a stale copy) in certain cases (sections 13.1.5 and 14.9.4). You must set the must-revalidate directive of the Cache-Control header to prevent this. On the one hand, this seems like a kind of "do it harder" stupidity, but I can imagine use cases. For example, if CNN's news page expired in 10 minutes but your connection was horribly slow, then allowing stale information for, say, 1 hour would be reasonable. In any case, there is no point complaining about the design; servers must work with the web as it is, not as we would like.

The simplest thing to do would be to set both "Expires: -1" and "Cache-Control: no-cache" for dynamically generated pages. Modern browsers should do what we expect. Unfortunately, it looks like older browsers conflated the cache with the history mechanism (e.g. the back button). IMHO, the back button should always work. RFC 2616 is quite clear, section 13.13:

[...]History mechanisms and caches are different. In particular history mechanisms SHOULD NOT try to show a semantically transparent view of the current state of a resource. Rather, a history mechanism is meant to show exactly what the user saw at the time when the resource was retrieved. By default, an expiration time does not apply to history mechanisms.
If the entity is still in storage, a history mechanism SHOULD display it even if the entity has expired, unless the user has specifically configured the agent to refresh expired history documents.[...]

Unfortunately, older versions of IE do the wrong thing and prevent you from using the back button if no-cache is set. It looks like IE >= 5 fixed that problem:

http://support.microsoft.com/kb/199805

Since IE 4 is positively prehistoric, maybe we don't care and should just use no-cache.

My proposed change is to keep the Expires header and add a Cache-Control header with a max-age directive, plus a must-revalidate directive if max-age is 0. I think this should be sufficient to prevent detrimental caching and also shouldn't provoke any weird browser behavior. See the patch below.

  Neil

--- a/quixote/http_response.py
+++ b/quixote/http_response.py
@@ -124,10 +124,10 @@ class HTTPResponse:
         future requests. The cookie value is stored as the "value"
         attribute. The other attributes are as specified by RFC 2109.
       cache : int | None
-        the number of seconds the response may be cached. The default is 0,
-        meaning don't cache at all. This variable is used to set the HTTP
-        expires header. If set to None then the expires header will not be
-        added.
+        the number of seconds the response may be cached. The default
+        is 0, meaning don't cache at all. This variable is used to set
+        the HTTP expires and cache-control headers. If set to None then
+        no headers will be added.
       javascript_code : { string : string }
         a collection of snippets of JavaScript code to be included in
         the response. The collection is built by calling add_javascript(),
@@ -138,7 +138,7 @@ class HTTPResponse:
 
     DEFAULT_CONTENT_TYPE = 'text/html'
     DEFAULT_CHARSET = None # defaults to quixote.DEFAULT_CHARSET
-    
+
     def __init__(self, status=200, body=None, content_type=None, charset=None):
         """
         Creates a new HTTP response.
@@ -412,14 +412,37 @@ class HTTPResponse:
 
         # Cache directives
         if self.cache is None:
-            pass # don't mess with the expires header
-        elif "expires" not in self.headers:
+            pass # don't mess with the expires or cache control header
+        else:
+            # We add both an Expires header and a Cache-Control header
+            # with a max-age directive. The max-age directive takes
+            # priority when both Expires and max-age are present (even
+            # if Expires is more restrictive, RFC 2616 section 14.9.3).
             if self.cache > 0:
                 expire_date = formatdate(now + self.cache)
+                cache_control = "max-age=%d" % self.cache
             else:
-                expire_date = "-1" # allowed by HTTP spec and may work better
-                                   # with some clients
-            headers.append(("Expires", expire_date))
+                # This is the default case and makes sense for a dynamically
+                # generated response that can change on each request.
+                #
+                # Using the current date is not a good idea since clocks
+                # might not be synchronized. Any invalid date is treated
+                # as in the past but Microsoft recommends "-1" for
+                # Internet Explorer so that's what we use.
+                expire_date = "-1"
+                # The Expires header is sufficient for HTTP 1.0 but
+                # for HTTP 1.1 we must add a must-revalidate directive.
+                # Clients and proxies are allowed to ignore Expires in
+                # certain cases and use stale pages (RFC 2616 sections
+                # 13.1.5 and 14.9.4).
+                cache_control = "max-age=0, must-revalidate"
+            if ("expires" not in self.headers and
+                "cache-control" not in self.headers):
+                # If either of these headers are set then don't add
+                # any of them. We assume the programmer knows what he
+                # is doing in that case.
+                headers.append(("Expires", expire_date))
+                headers.append(("Cache-Control", cache_control))
 
         # Content-type
         if "content-type" not in self.headers:
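
For anyone who wants to poke at the header selection outside of Quixote, the logic in the patch can be sketched as a standalone function. This is only an illustration, not part of the patch: the name cache_headers and its "existing" and "now" parameters are made up for the example, and email.utils.formatdate with usegmt=True stands in for whatever formatdate http_response.py actually imports.

```python
import time
from email.utils import formatdate


def cache_headers(cache, existing=(), now=None):
    """Return the (name, value) header pairs the patched logic would add.

    cache    -- seconds the response may be cached, 0 for "don't cache",
                or None to leave the caching headers alone
    existing -- lower-cased names of headers the application already set
                (hypothetical stand-in for self.headers)
    now      -- current time as a Unix timestamp (defaults to time.time())
    """
    if cache is None:
        # Caller asked us not to touch Expires or Cache-Control.
        return []
    if "expires" in existing or "cache-control" in existing:
        # The application set one of the headers itself; add neither.
        return []
    if now is None:
        now = time.time()
    if cache > 0:
        # Cacheable response: advertise the lifetime both ways.
        # max-age takes priority over Expires when both are present.
        expires = formatdate(now + cache, usegmt=True)
        cache_control = "max-age=%d" % cache
    else:
        # Default, dynamic response: already expired, and HTTP 1.1
        # caches must revalidate instead of serving a stale copy.
        expires = "-1"
        cache_control = "max-age=0, must-revalidate"
    return [("Expires", expires), ("Cache-Control", cache_control)]


if __name__ == "__main__":
    for name, value in cache_headers(0):
        print("%s: %s" % (name, value))
    for name, value in cache_headers(600):
        print("%s: %s" % (name, value))
```

With cache=0 this yields "Expires: -1" together with "Cache-Control: max-age=0, must-revalidate"; with a positive value it yields a real expiry date plus the matching max-age. Passing "expires" or "cache-control" in "existing" suppresses both headers, mirroring the "programmer knows what he is doing" rule in the patch.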