Re: [Quixote-users] urllib.quote() and cgi.escape()

On Jan 24, 2006, at 8:21 PM, Titus Brown wrote:

> So, should htmlescape deal with this differently?
>
> right now it does this:
>
>>>> print str(htmlescape("'"))
> '
>>>> print str(htmlescape('"'))
> &quot;

If you were trying to use these characters in a URI value (for their
normal meaning in that context!) then my understanding is that you have
to use their HTML char entities: &amp; &lt; &gt; &quot;. This way, the
HTML document can be valid.

If however you are trying to use them as a string literal value in a
URI context, then you should use the %xx mechanism.
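
To make the two mechanisms concrete, here is a minimal sketch. It assumes
Python 3, where html.escape and urllib.parse.quote are the successors of
cgi.escape and urllib.quote; the variable names are only for illustration.

```python
from html import escape          # Python 3 successor of cgi.escape
from urllib.parse import quote   # Python 3 successor of urllib.quote

literal = "contains'different\"quotes&stuff"

# Literal data inside a URI component: percent-encode it (%xx).
quoted = quote(literal, safe="")

# Markup-significant characters in HTML text or attributes: use entities.
escaped = escape(literal, quote=True)

print(quoted)   # contains%27different%22quotes%26stuff
print(escaped)  # contains&#x27;different&quot;quotes&amp;stuff
```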

(I was however unable to easily find a clear and convenient statement
of the above in RFC 2396).

Now, in your original question, you were actually trying to use such
characters in the literal value attribute of an input element... as
this value can become a part of the URL for the page (e.g. in the
querystring) then it should follow that it should be escaped with
urllib.quote(), i.e. the %xx mechanism.

So, similar to your original example:
''
%("""contains'different"quotes&stuff""")

and assume some other input field:
''

if we submit the form (or specify the fields in the querystring for the
page) we should end up with a querystring such as:
?one=contains%27different%22quotes%26stuff&amp;two=normal

Note that the &amp; (as delimiter!) is html escaped as it should be,
but the & as literal value (%26) is url escaped (as it should be?).
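
A rough sketch of how such a querystring could be produced, again
assuming Python 3 (urllib.parse.urlencode plus html.escape). The field
names "one" and "two" follow the example above; everything else is
illustrative. The values are url quoted first, and the whole querystring
is html escaped only when it is written into an HTML document.

```python
from html import escape
from urllib.parse import urlencode

fields = {"one": "contains'different\"quotes&stuff", "two": "normal"}

# Step 1: url quote each value and join the pairs with a plain "&".
query = "?" + urlencode(fields)

# Step 2: html escape the result for use inside an HTML attribute,
# which turns the delimiter "&" into "&amp;" and leaves %26 alone.
href = escape(query, quote=True)

print(query)  # ?one=contains%27different%22quotes%26stuff&two=normal
print(href)   # ?one=contains%27different%22quotes%26stuff&amp;two=normal
```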

But, re your actual question above, I was under the impression that the
"'" character should also be escaped with &apos; ... but, I see that
this char entity is not even listed in
. So, maybe not.
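
For what it is worth, this can be checked from Python 3's standard
library, assuming html.entities.name2codepoint is taken as the list of
HTML 4 named entities. In this sketch, html.escape sidesteps the issue
by emitting a numeric character reference for the apostrophe.

```python
from html import escape
from html.entities import name2codepoint  # HTML 4 named entities

# "quot" is a named entity in HTML 4, "apos" is not.
has_quot = "quot" in name2codepoint
has_apos = "apos" in name2codepoint

# html.escape uses a numeric reference for "'" instead of a name.
apostrophe = escape("'", quote=True)

print(has_quot, has_apos, apostrophe)  # True False &#x27;
```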

mario