durusmail: qp: empty script_name for unslashed root requests under cgi
Mario Ruggier
2007-08-10
> >     # location: '/app.cgi'
> >     SCRIPT_NAME: ''
> >     PATH_INFO: '/app.cgi'
>
> Ugh.  I want Apache + app.cgi to provide
> SCRIPT_NAME: '/app.cgi'
> and
> PATH_INFO: ''
> In this case.

That would make a lot more sense.

> If you use:
>     ScriptAlias /foo 
> do you get the same results?

Yes, for the request '/foo' it gives:
    SCRIPT_NAME: ''
    PATH_INFO: '/foo'
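
To make the difference between the two splits concrete, here is a minimal sketch. The helper names request_path and app_root are made up for illustration and are not part of qp. It assumes, as in the CGI convention, that SCRIPT_NAME followed by PATH_INFO gives the requested path and that an application treats SCRIPT_NAME as its root.

```python
def request_path(env):
    """Path the client asked for, rebuilt from the two CGI variables."""
    return env.get('SCRIPT_NAME', '') + env.get('PATH_INFO', '')


def app_root(env):
    """Prefix that an application would treat as its own root (hypothetical helper)."""
    return env.get('SCRIPT_NAME', '')


# What Apache reports here for the request '/app.cgi'.
observed = {'SCRIPT_NAME': '', 'PATH_INFO': '/app.cgi'}
# What one would like to get instead.
expected = {'SCRIPT_NAME': '/app.cgi', 'PATH_INFO': ''}

# Both splits describe the same requested path ...
print(request_path(observed) == request_path(expected))
# ... but they disagree about where the application is rooted.
print(repr(app_root(observed)), repr(app_root(expected)))
```

The requested path is the same either way; only the root the application sees changes, and anything built on top of that root inherits the error.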


> I think you should not be using SCRIPT_FILENAME at all, since it is not
> part of the CGI 1.1 specification.

My thinking there is that it can do no harm: if SCRIPT_FILENAME is not present,
then nothing changes.

> In your case, it looks like this method could be:
>
>      def process (self, stdin, env):
>          if env.get('PATH_INFO') == '':
>              env['PATH_INFO'] = env.get('SCRIPT_NAME')
>              env['SCRIPT_NAME'] = ''
>          hit = Hit(stdin, env)
>          self.process_hit(hit)
>          return hit
>
> This hack also frustrates me though.  The cgi-caller really should
> provide a SCRIPT_NAME that makes sense.

Hmm, in my case PATH_INFO is never an empty string, so this will never match.
Or did you mean to switch the two variables here? I.e. the following works in
this root case (but clearly will not in the general case):

    def process (self, stdin, env):
        if env.get('SCRIPT_NAME') == '':
            env['SCRIPT_NAME'] = env.get('PATH_INFO')
            env['PATH_INFO'] = ''
        hit = Hit(stdin, env)
        self.process_hit(hit)
        return hit
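
As a sketch of why the swap only suits this one case, the same logic can be run on plain dicts, outside qp. The function name swap_if_script_name_empty and the second environment are hypothetical, chosen only to illustrate the point; a missing PATH_INFO is treated as '' here, which is an assumption of this sketch.

```python
def swap_if_script_name_empty(env):
    """Return a copy of env with PATH_INFO moved into an empty SCRIPT_NAME."""
    env = dict(env)  # work on a copy, leave the caller's dict alone
    if env.get('SCRIPT_NAME') == '':
        env['SCRIPT_NAME'] = env.get('PATH_INFO', '')
        env['PATH_INFO'] = ''
    return env


# The root case from this thread: the swap gives the desired split.
root_case = swap_if_script_name_empty(
    {'SCRIPT_NAME': '', 'PATH_INFO': '/app.cgi'})
print(root_case)

# Hypothetical site that really is deployed at the server root: an empty
# SCRIPT_NAME is correct there, and the swap turns a page path into the
# script name.
general_case = swap_if_script_name_empty(
    {'SCRIPT_NAME': '', 'PATH_INFO': '/some/page'})
print(general_case)
```

In the second case the application would believe it is rooted at '/some/page', which is the general-case breakage referred to above.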

>
> Another idea:
>
> Add this to your root directory:
>
>       def _q_lookup(self, component):
>           redirect('')
>
> Wouldn't this redirect all unknown paths, like '/app.cgi', to '/'?

I tried this, after removing my overridden publisher.process():

    def _q_lookup (self, component):
        r = super(MySiteDirectory, self)._q_lookup(component)
        if r is not None:
            return r
        get_publisher().redirect('')

When requesting '/app.cgi' this gives a redirect loop... I guess the redirect
should be:

        env = get_publisher().get_hit().get_request().environ
        # Debug output: component, SCRIPT_NAME and PATH_INFO, colon-separated.
        print('%s:%s:%s' % (component, env['SCRIPT_NAME'],
                env.get('PATH_INFO')))
        get_publisher().redirect(env['PATH_INFO']+'/')

Interestingly, when requesting '/app.cgi' the above prints:
app.cgi::/app.cgi
This must simply be because PATH_INFO is being mangled here.
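
One plausible reading of the redirect loop, sketched with the standard library only: if an empty location is resolved like an ordinary relative reference against the requested URL, it resolves to that same URL. This is an assumption about how the empty location ends up being resolved, not a statement about qp's redirect(); the host example.org is a placeholder.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2

request_url = 'http://example.org/app.cgi'

# An empty relative reference resolves to the URL it is relative to,
# so the client would be sent straight back to the same request.
loop_target = urljoin(request_url, '')

# Redirecting to PATH_INFO plus a trailing slash gives a different URL.
fixed_target = urljoin(request_url, '/app.cgi' + '/')

print(loop_target)
print(fixed_target)
```

Under that assumption, redirect('') for the request '/app.cgi' never leaves '/app.cgi', while the PATH_INFO + '/' form lands on the slashed URL.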

In the end I am leaning towards overriding process_inputs() instead, as this
will also redirect correctly. The _q_lookup() trick has the advantage of being
executed only when needed, not on every hit, but I find it more convoluted.

    # Handle case when site is deployed under a script, but the requested
    # url ends with precisely the SCRIPT_NAME (with nothing trailing it).
    # This erroneously (on Apache2) gives an empty string SCRIPT_NAME and
    # a PATH_INFO set to the actual script_name -- that will break any
    # dynamically calculated relative url paths.
    # Assumption here is that if a SCRIPT_FILENAME is set, then it must
    # end with the (non-empty) SCRIPT_NAME.
    def process_inputs (self):
        env = self.get_hit().get_request().environ
        if env.get('SCRIPT_FILENAME'):
            if not env.get('SCRIPT_NAME') and env.get('PATH_INFO'):
                if env['SCRIPT_FILENAME'].endswith(env['PATH_INFO']):
                    self.redirect(env['PATH_INFO'] + '/')
        super(Publisher, self).process_inputs()


Yes, it relies on SCRIPT_FILENAME... but it is a hack to fix servers returning an
incorrect SCRIPT_NAME in this special situation. Maybe servers that do not
supply SCRIPT_FILENAME would return the correct SCRIPT_NAME anyway?!?
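
The condition in process_inputs() can be tried out in isolation with a small sketch like the one below. The function name trailing_slash_redirect and the file paths are hypothetical; only the three environment variables come from the code above.

```python
def trailing_slash_redirect(env):
    """Return the redirect target for the unslashed root request, else None."""
    script_filename = env.get('SCRIPT_FILENAME')
    script_name = env.get('SCRIPT_NAME')
    path_info = env.get('PATH_INFO')
    # Only act when the server supplies SCRIPT_FILENAME, SCRIPT_NAME is
    # empty, and PATH_INFO looks like the tail of the script's file name.
    if script_filename and not script_name and path_info:
        if script_filename.endswith(path_info):
            return path_info + '/'
    return None


# The problem case: '/app.cgi' requested without a trailing slash.
broken = {'SCRIPT_FILENAME': '/var/www/app.cgi',   # hypothetical path
          'SCRIPT_NAME': '', 'PATH_INFO': '/app.cgi'}
print(trailing_slash_redirect(broken))

# A sane split, or a server without SCRIPT_FILENAME, is left alone.
sane = {'SCRIPT_FILENAME': '/var/www/app.cgi',
        'SCRIPT_NAME': '/app.cgi', 'PATH_INFO': '/'}
no_filename = {'SCRIPT_NAME': '', 'PATH_INFO': '/app.cgi'}
print(trailing_slash_redirect(sane), trailing_slash_redirect(no_filename))
```

One thing the sketch makes visible: if the URL name differs from the file name (say an alias '/foo' pointing at a file called app.cgi, assuming SCRIPT_FILENAME holds the real file path), the endswith() test fails and no redirect happens.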

Related question: is the current check in process() still needed? Which
situations (deployment scenario, server, OS?) trigger it to be executed? The
long comment for it does not seem to indicate that...

mario