durusmail: quixote-users: Default page encoding?
Default page encoding?
2005-01-26
Kevin Dangoor
2005-02-01
Hi Neil,

Sorry for arriving a little late to this exchange that I started...

Neil Schemenauer wrote:

>On Wed, Jan 26, 2005 at 08:48:43AM -0500, Kevin Dangoor wrote:
>
>
>>If there *is* currently a way to set the default, that would be nice to
>>know.
>>
>>
>
>There is currently no way to change the default.  You can do
>something like this:
>
>    class RootDirectory(Directory):
>        def _q_traverse(self, path):
>            get_response().set_charset('utf-8')
>            return super(RootDirectory, self)._q_traverse(path)
>
>
That's a pretty good solution, because it lets you set it on a
per-Directory basis. Of course, when *everything* fits a certain
encoding scheme, your next solution seems better...

>We could add a DEFAULT_CHARSET attribute to HTTPResponse (see the
>attached patch).  There is still a problem with set_content_type(),
>I think.  set_content_type() is often used when serving up binary
>data.  In that case I don't think you want the default charset to be
>UTF-8, even though you probably want text/* responses to be UTF-8.
>We could change set_content_type() to this:
>
>    def set_content_type(self, content_type, charset=None):
>        self.content_type = content_type
>        if charset is None:
>            if content_type.startswith('text/'):
>                charset = self.DEFAULT_CHARSET
>            else:
>                charset = 'iso-8859-1'
>        self.charset = charset
>
>
I don't think you need to go through all of that. I would be surprised
if the browser paid attention to the "charset" when it is just handling a
stream of bytes such as an image. The charset should only matter if the
browser is actually rendering some text. As a test, I just telnetted to my
server and requested a jpg that was served up by a StaticDirectory. Not
surprisingly, here's what the header said:
Content-type: image/jpeg; charset=iso-8859-1
and the browser did not choke on the image at all.
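
To make the header above concrete, here is a small sketch that splits such
a Content-type value into its media type and charset parameter. It uses
Python's standard email package purely as a convenient header parser; that
choice is an assumption for illustration and has nothing to do with how
Quixote or a browser handles the header.

```python
from email.message import Message

# The header value seen in the telnet test above.
raw_value = 'image/jpeg; charset=iso-8859-1'

# Message is used here only as a generic header parser (illustrative choice).
msg = Message()
msg['Content-Type'] = raw_value

media_type = msg.get_content_type()    # the type/subtype part
charset = msg.get_content_charset()    # the charset parameter, lowercased

# The charset is just a parameter attached to the media type; for an
# image/* type there is no text for it to apply to.
is_text = media_type.startswith('text/')
```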

As Oleg points out in another message, some charsets may prove to be a
problem with some types of files (like CSS). Offering a DEFAULT_CHARSET,
as you do with your patch, doesn't remove the ability to set the charset
individually as needed. And the patch also doesn't change the default
default, so no backwards compatibility is broken.
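
As a rough sketch of that point, assuming only what the quoted messages
describe: a class-level DEFAULT_CHARSET supplies the starting value, and
set_charset() still overrides it per response. SketchResponse, Utf8Response
and header_line() are hypothetical stand-ins invented for this example, not
Quixote's HTTPResponse or the attached patch.

```python
class SketchResponse:
    """Hypothetical stand-in for a response object (not Quixote's class)."""

    # Class-level default. Kept at iso-8859-1 here so that code which never
    # touches it sees the same behaviour as before.
    DEFAULT_CHARSET = 'iso-8859-1'

    def __init__(self, content_type='text/html'):
        self.content_type = content_type
        # Each response starts from the class default.
        self.charset = self.DEFAULT_CHARSET

    def set_charset(self, charset):
        # A per-response override remains available.
        self.charset = charset

    def header_line(self):
        # Hypothetical helper that formats the header for inspection.
        return 'Content-type: %s; charset=%s' % (self.content_type,
                                                 self.charset)


class Utf8Response(SketchResponse):
    # A site where everything is UTF-8 changes the default in one place.
    DEFAULT_CHARSET = 'utf-8'


plain = SketchResponse()
site_wide = Utf8Response()
one_off = Utf8Response('text/css')
one_off.set_charset('iso-8859-1')   # individual override, e.g. for CSS
```

Here plain keeps the old default, site_wide picks up UTF-8 without any
per-Directory code, and one_off shows that a single response can still be
set individually.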

I'm +1 on the patch.

Kevin
