durusmail: quixote-users: Re: Medusa bug or user misunderstanding?
Re: Medusa bug or user misunderstanding?
Jason E. Sibre
2004-01-22
Ugh...

Run off for a few hours, and this is the hornet's nest I return to??? ;)  So
many replies, and with so much valuable insight!

Well, in my simple case, something like Andrew's original solution seems to
work well, but it sure sparked a lot of debate.

On the one hand, I feel as Graham must have, when he wrote:
> I've concluded that I don't like the CGI spec any more, and put my trust
>   in de-facto standards. Namely: what does Apache/FCGI do?

After all, if IIS and Apache agree on something, that pretty much defines
the bulk of the environment.  Even if everyone else ganged up and did the
same thing, it'd be a smallish gang.  Of course, in Quixote's case, which is
what we're really talking about here, I think IIS is not important at all.

Does anyone use IIS with Quixote?

I guess what we have to think about is: What are we trying to do with that
request.get_server() ? After a brief search of the source code of Qx, the
only place I see it being used internally, is in get_url() which returns
"scheme://get_server/get_path"....  Quite suitable for redirects, and
"self-referencing URLs" (if I may nod at the cgi spec... btw, to point out
the obvious, DNS aliases don't HAVE to be the same as the machine's actual
name, so the cgi spec isn't really very informative on SERVER_NAME.  It's
almost contradictory.)  I promise to stop ranting. Eventually.

So, does everyone agree on the purpose of get_server()?  That it's there to
facilitate generating URLs to redirect/link to? (As opposed to providing
authoritative info about the server's really real name, or anything like
that?)

With this in mind, I think the Host header is the BEST place to go for the
info.  It addresses virtual hosting, proxying (as per Graham's test
scenario), and conveniently, provides the 'right' port info (nodding back to
Graham's test scenario).  If the proxy is pointing to 443 on the real
server, you can always find other ways to verify the SSL is on.
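
To make the "Host carries the port" point concrete, here is a minimal sketch of pulling the name and port out of a Host value.  The helper name split_host is made up for illustration (it is not part of Quixote or Medusa), and it assumes plain host names, not bracketed IPv6 literals:

```python
def split_host(host_header, default_port=80):
    """Split a Host header value into (name, port).

    Hypothetical helper for illustration only. Assumes 'name' or
    'name:port'; bracketed IPv6 literals are not handled.
    """
    name, sep, port = host_header.strip().partition(":")
    if sep and port.isdigit():
        # The client named an explicit port, e.g. "machine1:8080".
        return name, int(port)
    # No port given: the default port for the service is implied.
    return name, default_port


print(split_host("machine1:8080"))
print(split_host("machine1"))
```

With Graham's proxy test, the value "machine1:8080" yields the port the client actually connected to, which is the one a redirect needs.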

So, we can use Host, right?

Well, that does leave two rubs...  HTTP/1.1, and Non-Compliant Browsers...
I'm doing all my tinkering today with Mozilla, but really, the Host header
comes from the client, so it's up to the client to send the right
info.  Then again, if the client sabotages us....  Well, I guess we all
gotta live in the same world; if that means putting in a reasonable effort
to accommodate broken clients, so be it.  Far as I can tell, Host is not part
of HTTP/1.0, but some 1.0 requests include it anyway (corrections, anyone?).
If we get a request without a Host header, we gotta fall back, somehow.  Jim
made the very interesting suggestion of:
"""
A fallback, in case for some reason HTTP_HOST is missing, would be to
use HTTP_REFERER as I did in the app_redirect() function shown in my
earlier posting on this subject.
"""

I think that would work fairly well, most of the time, but I'm not sure it's
the ideal solution because Referer will be TOTALLY bogus (for our purposes)
if the Referer was another site.  In apps like the ones we tend to write,
that's probably a very small percentage of the requests we get, but, oy!
What a headache troubleshooting would become!
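
A small sketch of why the Referer fallback is fragile.  The function name host_from_referer and the example URLs are made up for illustration; this is not Jim's app_redirect() code, just the general idea of taking the host part of the Referer URL:

```python
from urllib.parse import urlsplit


def host_from_referer(referer):
    """Return the host[:port] part of a Referer URL, or '' if there is none.

    Hypothetical helper for illustration only.
    """
    return urlsplit(referer or "").netloc


# On-site navigation: the Referer names our own server, so the guess is right.
print(host_from_referer("http://thishost.example.com:8080/app/page"))

# Arriving from another site: the Referer names *their* server, not ours.
print(host_from_referer("http://search.example.org/results?q=quixote"))
```

The second call returns the other site's host, so a redirect built from it would send the user somewhere else entirely.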

If we have to failover (because Host is missing, for whatever reason), we
really have no other option but to use what the server provides.  I did a
test simulating a 'Host'less HTTP/1.0 request via telnet, and Apache
provided my machine's 'real' FQDN (e.g., "thishost.example.com"), which was so
configured by the httpd.conf ServerName directive (or maybe it figured it
out through some other means).  Medusa (or rather, our current handler)
provides the bare host name (e.g., "thishost").  That could be modified to act
as Medusa does (internally) and Apache does, by changing the
"socket.gethostname()" in

                   'SERVER_NAME': self.server.ip or socket.gethostname(),

to "socket.gethostbyaddr(socket.gethostname())[0]" (the [0] picks the host
name out of the tuple that call returns).  Yes, it's long and wordy,
but it works.  On my box... Would it work without a functional DNS?  I don't
know.  I don't think that answer matters in this context.  This is
essentially what Medusa does internally.  This may also incur more overhead
on a per request basis than is comfortable, so maybe that should be figured
out at start up, and stored in QuixoteHandler.server.ip, if it's blank.
Then again, I guess it could be done in the driver script, and passed in
when constructing the http_server.  I do think it would be nicer if the
QuixoteHandler did it, though.  It's a sensible default behavior, IMHO.
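
A minimal sketch of working the name out once at startup, assuming Python 3.  The helper name default_server_name and the fallback to the bare host name when the lookup fails are assumptions for illustration, not what QuixoteHandler or Medusa actually do:

```python
import socket


def default_server_name(ip=""):
    """Pick a SERVER_NAME once, at startup (hypothetical helper).

    An explicitly configured value wins; otherwise try to resolve the
    machine's fully qualified name, and fall back to the bare host name
    if the lookup fails (for example, with no functional DNS).
    """
    if ip:
        return ip
    hostname = socket.gethostname()
    try:
        # gethostbyaddr returns (primary_name, alias_list, address_list).
        return socket.gethostbyaddr(hostname)[0]
    except OSError:
        return hostname
```

The result could be computed once, in the driver script or when the handler is constructed, and reused for every request, so the lookup cost is not paid per request.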

That leaves another problem.

get_server() uses the SERVER_PORT to reconstruct the port info (if it's
needed)...  In the spirit of what we're talking about (providing an
effective get_server()), that may well mean populating SERVER_PORT from the
Host header, if it's there...  Or using something other than SERVER_PORT to
build get_server().  I personally lean toward the latter.  Leave SERVER_PORT
intact, and build get_server() from the original Host header if there (it'll
include the port if it's needed), and if it's not there, use SERVER_NAME and
SERVER_PORT to construct it?  Port may be wrong (in the case of proxies),
but if there's no Host header, I honestly don't think there's any way for us
to salvage it.  Unless the proxy provides us with additional headers.  I
don't have a proxy set up, and don't wanna learn that one tonight.  (I've
still got PTL issues to deal with!)
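
Roughly, the second option could look like the sketch below.  It assumes a CGI-style environ dict; the function name server_for_urls and the scheme parameter are made up for illustration and are not Quixote's actual get_server():

```python
def server_for_urls(environ, scheme="http"):
    """Return 'host' or 'host:port' for building redirect/link URLs.

    Hypothetical sketch: prefer the client's Host header, and only fall
    back to SERVER_NAME and SERVER_PORT when it is missing.
    """
    host = environ.get("HTTP_HOST", "").strip()
    if host:
        # Host already includes the port whenever one is needed.
        return host

    # No Host header: use what the server provides. Behind a proxy the
    # port may be wrong, but there is nothing better to go on.
    name = environ.get("SERVER_NAME", "").strip()
    port = str(environ.get("SERVER_PORT", "")).strip()
    default_port = "443" if scheme == "https" else "80"
    if port and port != default_port:
        return "%s:%s" % (name, port)
    return name


# Graham's proxy test: the Host header wins over SERVER_PORT.
print(server_for_urls({"HTTP_HOST": "machine1:8080",
                       "SERVER_NAME": "machine1",
                       "SERVER_PORT": "80"}))
```

SERVER_PORT itself is left untouched, so anything else that reads it still sees the port the server is really listening on.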


Comments?  Agreement, maybe?  (I mean on something other than me writing too
much!)

Sorry this was so drawn out, but I wanted to try to bring all our thoughts
together, even though I've probably missed a bit.

Jason







> -----Original Message-----
> From: quixote-users-bounces+jsibre=chironsys.com@mems-exchange.org
> [mailto:quixote-users-bounces+jsibre=chironsys.com@mems-exchange.org]On
> Behalf Of Graham Fawcett
> Sent: Wednesday, January 21, 2004 2:45 PM
> To: quixote-users@mems-exchange.org
> Subject: [Quixote-users] Re: Medusa bug or user misunderstanding?
>
>
> A.M. Kuchling wrote:
>
> > On Wed, Jan 21, 2004 at 02:35:39PM -0500, Graham Fawcett wrote:
> >
> >>Andrew's suggestion looks like a winner: it gives a nice failover in
> >>case 'Host' is not included in the request header.
> >
> >
> > One corner case: I'm not sure what happens if a port is specified.  Does
> > Host: contain "servername:8080"?  Is it OK to include the port in the
> > SERVER_NAME variable?
>
> D'oh. You're right about the Host header: it includes the port. If the
> port is not specified, the default port for the service is implied.
>
> After a re-read of the CGI spec, it turns out that SERVER_NAME is "not
> request-specific and [is] set for all requests". So, according to
> CGI/1.1, SERVER_NAME should not be equivalent to the Host header.
>
> I've concluded that I don't like the CGI spec any more, and put my trust
>   in de-facto standards. Namely: what does Apache/FCGI do?
>
> I set up a quick proxy at machine1:8080 which passes requests to an
> Apache server at machine2:80. Names have been changed to protect the
> innocent. Not a real HTTP proxy, BTW, just used Sam Rushing's
> proxy_server class from his Medusa writings.
>
> Whipped up a quick PHP page (ugh, I know, but I don't have Python on
> that machine):
>
> <?php
> echo "<p>HTTP_HOST: ". $_SERVER["HTTP_HOST"];
> echo "<p>SERVER_NAME: ". $_SERVER["SERVER_NAME"];
> echo "<p>SERVER_PORT: ". $_SERVER["SERVER_PORT"];
> ?>
>
> The results were interesting. Hitting my proxy at
>
> http://machine1:8080/graham/test.php
>
> machine2 returned these results:
>
> HTTP_HOST: machine1:8080
> SERVER_NAME: machine1
> SERVER_PORT: 80
>
> I assume that the PHP interpreter did not fiddle with the results.
>
> The SERVER_PORT is the port that Apache is listening on, not the port
> from my Host header. Interesting, but to be fair I wasn't using a real
> HTTP proxy; maybe it would have responded differently... but maybe not.
>
> In conclusion, Apache/FCGI uses the hostname in HTTP_HOST to set the
> SERVER_NAME variable, and thus is in flagrant violation of the
> "non-request-specific" directive in the CGI spec. ;-) Furthermore, it
> does not include the port in SERVER_NAME. QED.
>
> I'd suggest that http_request.py be changed to use HTTP_HOST instead of
> SERVER_NAME for any redirects, if HTTP_HOST is present.
>
> -- Graham
>
>
> _______________________________________________
> Quixote-users mailing list
> Quixote-users@mems-exchange.org
> http://mail.mems-exchange.org/mailman/listinfo/quixote-users
>
