durusmail: quixote-users: Docstring for html.py
Neil Schemenauer
2002-05-23
On Wed, May 22, 2002 at 06:52:57PM -0400, Neil Schemenauer wrote:
> Try this on for size (written assuming my proposed change of making
> html_quote replace " with &quot;):

I did some more research today and discovered that my explanations were
not entirely accurate.  I say again, what a mess.  I'm not an SGML/XML
guru so Andrew can correct my mistakes.

> html_quote
> ----------
>
> Use for quoting data that will be used within CDATA attribute values or
> as element contents.

The term CDATA should be dropped.  Not all attribute values are declared
as CDATA although for quoting purposes you can treat them that way.

> value_quote
> -----------
[...]
> The & character is also replaced with &amp; because some old browsers
> incorrectly interpret entity references inside of CDATA values.

This is not accurate.  Even modern XML processors have to recognize
entity references inside of data declared as CDATA.  This is confusing
because the whole idea of CDATA is that it is not supposed to be parsed.
See section 3.3.3 "Attribute-Value Normalization" of
http://www.xml.com/axml/axml.html for details.
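
A quick way to see attribute-value normalization in action is the sketch
below.  It assumes Python's standard xml.etree.ElementTree as the XML
processor (chosen only for illustration; it is not part of html.py), and
the element name, attribute name, and sample value are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the attribute value contains entity references.
doc = '<a title="Tom &amp; Jerry &lt;cartoon&gt;"/>'

elem = ET.fromstring(doc)

# The parser expands the references while reading the attribute value,
# so the application sees the literal characters, not the entities.
title = elem.get("title")
print(title)
```

The value handed back to the application holds the literal & and <
characters, which is why data placed in an attribute has to be quoted
beforehand.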

Note that the predefined XML entities are 'lt', 'gt', 'amp', 'apos', and
'quot'.  'apos' is not supported by old browsers so you really can only
use 'lt', 'gt', 'amp' and 'quot'.
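
A minimal sketch of a quoting function restricted to those four entities
might look like the following.  The name html_quote_sketch is
hypothetical and this is not the actual code in html.py, just one way
the rule could be written down.

```python
def html_quote_sketch(text):
    """Hypothetical helper: quote text using only lt, gt, amp and quot."""
    # Replace & first, otherwise the & in the entities added below
    # would itself be quoted a second time.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    # The single quote is deliberately left alone: 'apos' is avoided.
    return text


print(html_quote_sketch('a < b & "c"'))
```

Because the single quote is left untouched, a value quoted this way is
only safe inside attribute values delimited with double quotes.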

> html_quote is required because some old browsers incorrectly interpret
> SGML entity references inside of CDATA attribute values.

Again, this is inaccurate.  Modern XML processors must process entities
in CDATA context (so-called attribute-value normalization).

  Neil

