Jason Sibre wrote:
> On Thursday, August 19, 2004 4:41 PM Neil Schemenauer wrote:
>
>>Has anyone tried this? I'm wondering if this is a safe change to
>>make in the Quixote 1.x series.
>>
>
> No!
> Um... This went in one ear and out the other when Graham first posted this,
> but I don't think this is a good idea.
>
> [Disclaimer, I haven't tried the patch. I've only read the code]
>
> The original code didn't "get rid of" the content-type and content-length
> headers, it just skipped the mangling step of prepending the "HTTP_" to
> them, inserting them as "CONTENT_TYPE" and "CONTENT_LENGTH" (instead of
> "HTTP_CONTENT_TYPE" and "HTTP_CONTENT_LENGTH" which is what the patch will
> cause)... They aren't supposed to have the HTTP_ prefix. They're 'real'
> CGI environment variables. The HTTP_ prefix is something that's done to all
> the 'extra' headers that come across that the CGI spec doesn't specifically
> address.
Hold on... I'm a bit tired, and may be missing something, but I don't think
that's quite what's happening.
Here's a larger snippet of medusa_http.py:
environ = {'REQUEST_METHOD': request.command,
'ACCEPT_ENCODING': msg.get('Accept-encoding'),
'CONTENT_TYPE': msg.get('Content-type'),
'CONTENT_LENGTH': len(data),
"GATEWAY_INTERFACE": "CGI/1.1",
'HTTP_COOKIE': msg.get('Cookie'),
'HTTP_REFERER': msg.get('Referer'),
'HTTP_USER_AGENT': msg.get('User-agent'),
'PATH_INFO': path,
'QUERY_STRING': query_string,
'REMOTE_ADDR': remote_addr,
'REMOTE_PORT': str(remote_port),
'REQUEST_URI': request.uri,
'SCRIPT_NAME': '',
"SCRIPT_FILENAME": '',
'SERVER_NAME': self.server.ip or socket.gethostname(),
'SERVER_PORT': str(self.server.port),
'SERVER_PROTOCOL': 'HTTP/1.1',
'SERVER_SOFTWARE': self.server_name,
}
# Propagate HTTP headers - copied from twisted_http.py
for title, header in msg.items():
envname = title.replace('-', '_').upper()
if title not in ('content-type', 'content-length'):
envname = "HTTP_" + envname
environ[envname] = header
for k,v in environ.items():
if v == None:
environ[k] = ''
Note that CONTENT_TYPE and CONTENT_LENGTH are being set explicitly, based on
the actual length of the content, and the actual Content-type header.
The proposed change
for title, header in msg.items():
envname = title.replace('-', '_').upper()
envname = "HTTP_" + envname
environ[envname] = header
would simply add an HTTP_ version of *every* header that was passed in the
request, including HTTP_CONTENT_TYPE and HTTP_CONTENT_LENGTH, which aren't
"extra" headers, of course. But I can't think of any case where the presence of
these variables would pose any kind of problem. Their inclusion would be the
only effect of the proposed change; and in return we don't have to run a string
comparison on each header. A minor change to be fair, but why not save the
cycles?
So I think the modification is legitimate. In fact, I'd go an extra step, and
cut out the lines
'HTTP_COOKIE': msg.get('Cookie'),
'HTTP_REFERER': msg.get('Referer'),
'HTTP_USER_AGENT': msg.get('User-agent'),
from the environ dict setup, since these will be re-created by the msg.items()
loop.
Still haven't tested the code, but it seems logical... what do you think? Are
there any cases where the presence of HTTP_CONTENT_LENGTH (in addition to the
presence of CONTENT_LENGTH) would pose a problem?
-- Graham