durusmail: quixote-users: Adding magic to PTL, Or: how to stop worrying about XSS holes
Neil Schemenauer
2002-10-01
Greetings gentle Quixote users,

I have come up with a new feature I would like to add to PTL and would
appreciate feedback from the Quixote user community.

Anyone who reads BUGTRAQ or LWN for more than a week realizes that
cross-site scripting (XSS) holes are an entirely too common flaw in web
applications.  The reason is simple: it's nearly impossible to remember
to quote all data that could contain HTML markup.  I'd like to think our
team at the MEMS Exchange is more aware of this problem than the average
web developer yet still we find places in our applications where quoting
is inadequate.  There has to be a better way.

I propose using a new string type for strings containing markup.  Let's
call this type 'Markup'.  The 'Markup' string type behaves similarly to
the 'str' type except for a few key differences.  When concatenating
'str' and 'Markup', the 'str' is first quoted and the result is a
'Markup' string.  When two 'Markup' strings are concatenated, they are
concatenated without quoting.  When the LHS of the % operator is a
'Markup' string, any RHS arguments that will be used for %s, %c or %r
format codes and are not of type 'Markup' will be quoted.  Calling str()
on a 'Markup' string will return its contents as a 'str' (i.e. the
'Markup' type is stripped away).

Here are some examples:

    >>> Markup('%s') % 'Joe & Bob'
    <Markup 'Joe &amp; Bob'>
    >>> Markup('') + '$ echo hello > test.txt' + Markup('')
    <Markup '$ echo hello &gt; test.txt'>
    >>> str(_)
    '$ echo hello &gt; test.txt'
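
The rules above could be sketched roughly as follows.  This is only an
illustration written against current Python, not the implementation
proposed for Quixote: the helper class '_Quoted' is a hypothetical name,
and html.escape() stands in for whatever quote function PTL would
actually use.  The sketch covers %s and %r with a single value or a
tuple on the RHS; %c and mapping arguments are left out.

```python
from html import escape


class _Quoted:
    """Hypothetical helper: makes %s and %r see the escaped form."""

    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return escape(str(self.value), quote=True)

    def __repr__(self):
        return escape(repr(self.value), quote=True)


class Markup:
    """Sketch of the proposed string type (an assumption, not PTL code)."""

    __slots__ = ("s",)

    def __init__(self, s=""):
        self.s = str(s)

    def __str__(self):
        # str() strips the Markup type and returns the raw contents.
        return self.s

    def __repr__(self):
        return "<Markup %r>" % self.s

    @staticmethod
    def _quote(value):
        # Markup passes through untouched; everything else is escaped.
        if isinstance(value, Markup):
            return value.s
        return escape(str(value), quote=True)

    def __add__(self, other):
        return Markup(self.s + self._quote(other))

    def __radd__(self, other):
        return Markup(self._quote(other) + self.s)

    def __mod__(self, args):
        if not isinstance(args, tuple):
            args = (args,)
        # Numbers stay as they are so %d keeps working; Markup is trusted.
        wrapped = tuple(
            a if isinstance(a, (Markup, int, float)) else _Quoted(a)
            for a in args
        )
        return Markup(self.s % wrapped)


greeting = Markup("<b>%s</b>") % "Joe & Bob"
print(repr(greeting))  # <Markup '<b>Joe &amp; Bob</b>'>
```

Note that a plain 'str' on the left of + is handled by __radd__, so the
result is a 'Markup' string no matter which side the unquoted data
comes from.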

Now for the magic part.  I propose that the PTL compiler be changed to
make all literal strings in PTL modules of type 'Markup' (or some other
suitable class).  Internally,

    template hello():
        'hello world'

would become:

    from quixote.html import HTMLMarkup as _q_markupclass
    template hello():
        _q_markupclass('hello world')

If you want a literal 'str' object in a PTL module, you would need to
surround it by str().

The result of this change would be automatic quoting of data that does
not appear as a literal string in a PTL module.  That's handy in itself
but the real advantage is the change in failure mode.  When this system
fails it would tend to fail by quoting too much instead of too little.
This is a huge security improvement, IMHO.  Too much quoting should be
immediately visible during testing.  Too little quoting is often not
visible during testing as most test data does not contain markup.

PTL modules could override the literal string type by adding

    _q_markupclass = someclass

at the top of the module.  The markup class would have to conform to a
simple interface.  'str' would also be a valid markup class that would
mean "give me the old PTL behavior".

One disadvantage of this proposal is the extra runtime cost.  I thought
about creating the markup strings at compile time or at module load time
but that presents many problems.  It's not impossible however and
perhaps a later version of PTL would use that approach.  I think the
runtime cost should be quite small.  Markup instances can use the
__slots__ declaration, making them quite small and cheap to create.
Also, I could reimplement the Markup type and the quote function in C.
Personally, I think some speed hit would be worth the convenience and
security benefits.

Comments?

--
Neil Schemenauer | MEMS Exchange
Software Engineer | http://www.mems-exchange.org/