durusmail: quixote-users: [PATCH] mod_python_handler fix (upstreamable?)
[PATCH] mod_python_handler fix (upstreamable?)
Shahms King
2004-08-13
On Thu, 2004-08-12 at 17:36 -0400, John Belmonte wrote:
> Neil Schemenauer wrote:
> > Can anyone who uses mod_python comment on this patch?  Is it
> > appropriate to include in the 1.x series of Quixote?
>
> I'm a little wary of anything messing with SCRIPT_NAME/PATH_INFO.  Is it
> that mod_python or Apache are doing something wrong?  Perhaps Shahms can
> provide some examples showing input environment, PythonOptions, and
> output environment.
>
> I've had success deriving the real script and path from REQUEST_URI,
> which isn't part of CGI/1.1 per se, but seems to be common.  It is
> effective even in the face of URL rewriting.  I fall back to SCRIPT_NAME
> if it doesn't exist.
>
> -John

It's been a little while, but if I remember correctly, SCRIPT_NAME is
always one path element beyond the last existing directory for a given
URL and PATH_INFO is the rest of it.  This isn't wrong per se, but can
lead to some strange interactions between Quixote and mod_python.  This
list isn't comprehensive (in fact, I know I'm leaving out some situations
that I don't remember), but here are the two simplest examples, assuming
a directory structure of:

/var/www/quixoteApplication/
/var/www/quixoteApplication/images

And an Apache config with:
Directory "/var/www/quixoteApplication" -> mod_python_publisher
Alias /quixoteApplication /var/www/quixoteApplication

A request for "/quixoteApplication/" will have a SCRIPT_NAME of
"/quixoteApplication/" and a PATH_INFO of "" (which leads to infinite
redirection with "fix trailing slash" on, but is trivially fixable).

A request for "/quixoteApplication/images/image.png" will have a
SCRIPT_NAME of "/quixoteApplication/images/image.png" and, again, a
PATH_INFO of "".
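
To make the rule concrete, here is a small Python 3 model of the behaviour
as recalled above.  It is only a sketch of that recollection, not of what
Apache or mod_python actually implement.  The function name
split_as_recalled and the existing_dirs argument are made up for
illustration, and existing directories are named by URL path rather than
by filesystem path to keep the Alias mapping out of the picture.

```python
def split_as_recalled(url_path, existing_dirs):
    """Return (SCRIPT_NAME, PATH_INFO) under the rule described above.

    SCRIPT_NAME runs one path element past the deepest existing
    directory; PATH_INFO is whatever is left.  `existing_dirs` is a set
    of URL paths (no trailing slash) that map to real directories.
    """
    parts = [p for p in url_path.split("/") if p]

    # Find how many leading elements name an existing directory.
    depth = 0
    for i in range(1, len(parts) + 1):
        if "/" + "/".join(parts[:i]) in existing_dirs:
            depth = i

    # The URL itself names an existing directory: everything ends up in
    # SCRIPT_NAME, trailing slash included, and PATH_INFO is empty.
    if depth >= len(parts):
        return url_path, ""

    script_name = "/" + "/".join(parts[: depth + 1])
    return script_name, url_path[len(script_name):]


dirs = {"/quixoteApplication", "/quixoteApplication/images"}
print(split_as_recalled("/quixoteApplication/", dirs))
print(split_as_recalled("/quixoteApplication/images/image.png", dirs))
```

Both calls return an empty PATH_INFO, which is exactly the situation that
leaves Quixote with nothing to traverse.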

You end up in a similar situation if you alias (or LocationMatch)
the Quixote application to a path that is more than one directory beyond
the last existing directory.  Say, for the same structure, you instead use:

Alias /some/long/url/with/extra/path/components /var/www/quixoteApplication

Now, every request for
"/some/long/url/with/extra/path/components/images/image.png" has all of
that in the PATH_INFO.  Odds are pretty good that you don't really want
Quixote handling the entire path, but only the "/images/image.png" part.
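
One way to get the split you actually want is to tell the handler where
the application is mounted and cut the path there.  The sketch below
(Python 3) assumes a hypothetical mount_prefix value supplied by
configuration, and it prefers REQUEST_URI when present, as John describes,
falling back to SCRIPT_NAME plus PATH_INFO.  The function name
split_at_mount is made up here, percent-decoding is deliberately left out,
and this is not necessarily what the patch under discussion does.

```python
from urllib.parse import urlsplit


def split_at_mount(environ, mount_prefix):
    """Return (SCRIPT_NAME, PATH_INFO) split at a known mount point.

    `environ` is a CGI-style dict; `mount_prefix` is the URL path the
    application is mounted at (an assumption: it has to come from
    configuration, it is not discovered here).
    """
    uri = environ.get("REQUEST_URI")
    if uri is not None:
        # REQUEST_URI carries the query string too; keep only the path.
        path = urlsplit(uri).path
    else:
        path = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")

    prefix = mount_prefix.rstrip("/")
    if path == prefix or path.startswith(prefix + "/"):
        return prefix, path[len(prefix):]

    # The request is not under the mount point: leave the values alone.
    return environ.get("SCRIPT_NAME", ""), environ.get("PATH_INFO", "")


env = {
    "REQUEST_URI":
        "/some/long/url/with/extra/path/components/images/image.png?size=2",
}
print(split_at_mount(env, "/some/long/url/with/extra/path/components"))
```

With the same function, a request for "/quixoteApplication/" mounted at
"/quixoteApplication" comes out with a PATH_INFO of "/", which also avoids
the trailing-slash redirect loop.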

You can see why these scenarios would cause some problems.  I'd venture
that determining the real SCRIPT_NAME and PATH_INFO from just the
environment variables given by Apache is not possible.  It's trivial to
fix the pathological infinite redirection case, but determining the
actual SCRIPT_NAME and PATH_INFO for every case is not.  mod_python
passes a lot of information in the request, including the entire Apache
configuration as nested lists.  Using a little creative parsing of this
tree and possibly some stats, it should be possible to determine the
correct SCRIPT_NAME and PATH_INFO entirely algorithmically.  It would,
however, be very expensive to do so on a per-request basis.
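
If the cost is the main objection, the lookup could perhaps be done once
per application directory and cached rather than repeated per request.
The sketch below (Python 3) only shows the caching idea.
expensive_mount_lookup is a stand-in with a hard-coded table, not real
config-tree parsing, and it assumes each directory is mounted at exactly
one URL prefix, which need not hold in a real Apache setup.

```python
from functools import lru_cache

# Records each time the slow path runs, to make the caching visible.
lookups = []


def expensive_mount_lookup(handler_dir):
    """Stand-in for walking the Apache configuration (hypothetical)."""
    lookups.append(handler_dir)
    table = {
        "/var/www/quixoteApplication":
            "/some/long/url/with/extra/path/components",
    }
    return table.get(handler_dir)


@lru_cache(maxsize=None)
def mount_prefix_for(handler_dir):
    """Resolve the mount prefix once per directory, then reuse it."""
    return expensive_mount_lookup(handler_dir)


for _ in range(3):
    prefix = mount_prefix_for("/var/www/quixoteApplication")

print(prefix, len(lookups))
```

A cache like this lives per process and would go stale if the Apache
configuration changed without a restart, so it trades correctness under
reconfiguration for speed.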

I agree that messing with SCRIPT_NAME and PATH_INFO is unpleasant, and
I'd like a better solution; I'm just not sure one is possible given
current mod_python and Apache.

--
Shahms E. King 
Multnomah ESD

Public Key:
http://shahms.mesd.k12.or.us/~sking/shahms.asc
Fingerprint:
1612 054B CE92 8770 F1EA  AB1B FEAB 3636 45B2 D75B
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQBBHNbl/qs2NkWy11sRAmnNAKDM2WOuXjmFu/KCjR4jGC4eQynmwgCeKl7e
j23S9PonKQhjk7QID2DXroI=
=8RZd
-----END PGP SIGNATURE-----