[Quixote-users] Re: medusa_http.py patch for basic authentication

Jason Sibre wrote:
> On Thursday, August 19, 2004 4:41 PM Neil Schemenauer wrote:
>
>>Has anyone tried this?  I'm wondering if this is a safe change to
>>make in the Quixote 1.x series.
>>
>
> No!
> Um... This went in one ear and out the other when Graham first posted this,
> but I don't think this is a good idea.
>
> [Disclaimer, I haven't tried the patch.  I've only read the code]
>
> The original code didn't "get rid of" the content-type and content-length
> headers, it just skipped the mangling step of prepending the "HTTP_" to
> them, inserting them as "CONTENT_TYPE" and "CONTENT_LENGTH" (instead of
> "HTTP_CONTENT_TYPE" and "HTTP_CONTENT_LENGTH" which is what the patch will
> cause)...  They aren't supposed to have the HTTP_ prefix.  They're 'real'
> CGI environment variables.  The HTTP_ prefix is something that's done to all
> the 'extra' headers that come across that the CGI spec doesn't specifically
> address.

Hold on... I'm a bit tired, and may be missing something, but I don't think
that's quite what's happening.

Here's a larger snippet of medusa_http.py:

         environ = {'REQUEST_METHOD': request.command,
                    'ACCEPT_ENCODING': msg.get('Accept-encoding'),
                    'CONTENT_TYPE': msg.get('Content-type'),
                    'CONTENT_LENGTH': len(data),
                    "GATEWAY_INTERFACE": "CGI/1.1",
                    'HTTP_COOKIE': msg.get('Cookie'),
                    'HTTP_REFERER': msg.get('Referer'),
                    'HTTP_USER_AGENT': msg.get('User-agent'),
                    'PATH_INFO': path,
                    'QUERY_STRING': query_string,
                    'REMOTE_ADDR': remote_addr,
                    'REMOTE_PORT': str(remote_port),
                    'REQUEST_URI': request.uri,
                    'SCRIPT_NAME': '',
                    "SCRIPT_FILENAME": '',
                    'SERVER_NAME': self.server.ip or socket.gethostname(),
                    'SERVER_PORT': str(self.server.port),
                    'SERVER_PROTOCOL': 'HTTP/1.1',
                    'SERVER_SOFTWARE': self.server_name,
                    }
         # Propagate HTTP headers - copied from twisted_http.py
         for title, header in msg.items():
             envname = title.replace('-', '_').upper()
             if title not in ('content-type', 'content-length'):
                 envname = "HTTP_" + envname
             environ[envname] = header
         for k,v in environ.items():
             if v == None:
                 environ[k] = ''

Note that CONTENT_TYPE and CONTENT_LENGTH are being set explicitly, based on
the actual length of the content, and the actual Content-type header.

The proposed change

         for title, header in msg.items():
             envname = title.replace('-', '_').upper()
             envname = "HTTP_" + envname
             environ[envname] = header

would simply add an HTTP_ version of *every* header that was passed in the
request, including HTTP_CONTENT_TYPE and HTTP_CONTENT_LENGTH, which aren't
"extra" headers, of course. But I can't think of any case where the presence of
these variables would pose any kind of problem. Their inclusion would be the
only effect of the proposed change; and in return we don't have to run a string
comparison on each header. A minor change to be fair, but why not save the
cycles?

So I think the modification is legitimate. In fact, I'd go an extra step, and
cut out the lines

                    'HTTP_COOKIE': msg.get('Cookie'),
                    'HTTP_REFERER': msg.get('Referer'),
                    'HTTP_USER_AGENT': msg.get('User-agent'),

from the environ dict setup, since these will be re-created by the msg.items()
loop.

Still haven't tested the code, but it seems logical... what do you think? Are
there any cases where the presence of HTTP_CONTENT_LENGTH (in addition to the
presence of CONTENT_LENGTH) would pose a problem?

-- Graham