durusmail: qp: empty script_name for unslashed root requests under cgi
empty script_name for unslashed root requests under cgi
mario ruggier
2007-08-09
Hi,

In a QP site deployed behind a CGI script, where for example the
root URL for the site is something like:

        http://domain.some/app.cgi/

Requesting it like that will give correct values for script_name and
script_filename, i.e.:

        SCRIPT_NAME = '/app.cgi'
        SCRIPT_FILENAME = ${htdocs root path} + '/app.cgi'

However, requesting the same but without the trailing slash, i.e.

        http://domain.some/app.cgi

results in an incorrectly empty script_name, while script_filename
stays correct:

        SCRIPT_NAME = ''
        SCRIPT_FILENAME = ${htdocs root path} + '/app.cgi'

The request is dispatched to the listening QP application, as per the
CGI's config, but a 404 is returned instead of the SiteDirectory's
index (it never gets that far). Seeing a 404 for the root of the site
is already surprising behaviour; after all, if the site were deployed
as e.g. a virtual host, there would be no surprise about what
http://domain.some returns. In addition, there is the problem that
any dynamic paths for support files that may need to be calculated to
serve the 404 page (requiring the correct value of script_name) will be
broken.
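
To illustrate the broken support-file paths, here is a minimal sketch.
The helper support_url() and the path 'css/site.css' are hypothetical
names made up for this example; they are not part of QP or SCGI.

```python
def support_url(environ, relative_path):
    """Build a site-relative URL for a support file (hypothetical helper).

    The URL is prefixed with SCRIPT_NAME so that it still resolves
    when the application is mounted behind a CGI script.
    """
    script_name = environ.get('SCRIPT_NAME', '')
    return script_name + '/' + relative_path.lstrip('/')


# Environment as seen for http://domain.some/app.cgi/
slashed = {'SCRIPT_NAME': '/app.cgi'}
# Environment as seen for http://domain.some/app.cgi
unslashed = {'SCRIPT_NAME': ''}

good = support_url(slashed, 'css/site.css')    # stays under the script
bad = support_url(unslashed, 'css/site.css')   # loses the '/app.cgi' prefix
```

With the empty script_name the generated link points outside the
application, so the stylesheet request never reaches it.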

I was wondering where this check should be handled.

In SCGI itself? Is it correct to forward an "empty" request? Should
SCGI know to add a slash for such "empty" root requests?

Or should QP be checking for this? In my app I have added the following
workaround logic to my _q_traverse:

         if len(path)==1: # i.e. if we are "done" with the traversal...
             environ = get_request().environ
             sfn = environ.get('SCRIPT_FILENAME')
             if sfn:
                 sn = environ.get('SCRIPT_NAME')
                 if not (sn and sfn.endswith(sn)):
                     get_publisher().redirect(
                         get_request().get_url() + '/')

If SCRIPT_FILENAME is None (or empty) it means that we are not deployed
behind a script, so the problem cannot occur in this case. Otherwise,
assume (?) that SCRIPT_NAME must be validly set and must also match the
end of SCRIPT_FILENAME; if it does not, redirect to the site's "/".
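
The same test can be pulled out of the traversal and exercised on its
own. This is only a sketch: the function name is made up for the
example, and '/var/www/htdocs' is an assumed stand-in for the htdocs
root path.

```python
def needs_root_slash_redirect(environ):
    """Return True if a root request arrived without its trailing slash.

    Mirrors the workaround logic: no SCRIPT_FILENAME means we are not
    behind a script; otherwise SCRIPT_NAME must be non-empty and must
    match the end of SCRIPT_FILENAME.
    """
    script_filename = environ.get('SCRIPT_FILENAME')
    if not script_filename:
        return False
    script_name = environ.get('SCRIPT_NAME')
    return not (script_name and script_filename.endswith(script_name))


# '/var/www/htdocs' is an assumed htdocs root, for illustration only.
with_slash = {'SCRIPT_NAME': '/app.cgi',
              'SCRIPT_FILENAME': '/var/www/htdocs/app.cgi'}
without_slash = {'SCRIPT_NAME': '',
                 'SCRIPT_FILENAME': '/var/www/htdocs/app.cgi'}
not_behind_script = {}
```

Keeping the decision in a function that takes only the environ makes it
easy to check against the three cases before wiring in the redirect.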

This check is only needed for requests for the site's root. All
requests for lower URLs invariably have the script_name correctly set.
Is there a simpler way to do this check? Is it QP that should be adding
the slash? Are there situations when adding the slash here would not be
the correct thing?

mario
