durusmail: quixote-users: Why not use UTF-8 by default
Why not use UTF-8 by default
Damjan
2005-06-04
I've been browsing the archives of this mailing list, and saw a discussion
on the topic "Why not use UTF-8 by default in Quixote 2?"
http://mail.mems-exchange.org/pipermail/quixote-users/2005-March/004355.html

I'd like to add my $0.02 on the issue.

Neil Schemenauer says:
"I suspect that many Quixote users have no use for Unicode at this time."
It may be the other way around: maybe people who need Unicode do not
use Quixote.

Ryan Tomayko says:
"ISO-8859-1 is also the default encoding for the text/html media type
(actually all text/* media types). It's probably not a bad choice of
default."

ISO-8859-1 is a BAD choice. Ask anyone who understands the issue and
he'll tell you that the ISO-8859-1 default in HTTP doesn't make any
sense. ISO-8859-1 is mostly unusable for anyone not living in the
ASCII world (mostly English speakers).

UTF-8, on the other hand, is a much better choice. For one thing, it
covers all possible languages; for another, if your application (or
framework) works OK with UTF-8, it will most likely work with any other
charset there is.
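
To make the difference concrete, here is a small sketch in present-day
Python 3 (an assumption on my part; nothing here is Quixote code). The
helper name can_encode is made up for illustration. It simply checks
whether a piece of text can be represented in a given charset.

```python
# Illustration only: compare how the two charsets handle non-Latin text.
name = "дамјан"  # six Cyrillic letters


def can_encode(text, charset):
    """Return True if every character in text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


latin1_ok = can_encode(name, "iso-8859-1")  # Cyrillic is outside ISO-8859-1
utf8_ok = can_encode(name, "utf-8")         # UTF-8 can encode any Unicode text
utf8_bytes = name.encode("utf-8")           # each Cyrillic letter takes 2 bytes
```

The same check fails for ISO-8859-1 with any script outside Western
European, which is the practical problem with it as a default.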

Quixote, as it is now, makes too many (wrong) assumptions about strings
being ISO-8859-1, and I've written about this before.

Finally, the point of this mail is not to bitch about the current
situation. I'd like to know how the developers feel about these
issues: what can I expect, and where will Quixote go next?

My suggestion would be to use unicode strings internally everywhere
(especially in htmltext!). Convert to quixote.DEFAULT_CHARSET (utf-8 by
default) when doing I/O (stdout/stderr I/O and network I/O). Be very
careful when using the str function.
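
A minimal sketch of that pattern, written in present-day Python 3 rather
than the Python of this thread. Everything in it is hypothetical: the
DEFAULT_CHARSET constant only mirrors the setting proposed above, and
read_input, render and write_output are made-up names, not Quixote
functions. The idea is to decode bytes as soon as they arrive, work on
text only, and encode again at the moment of output.

```python
import io

# Hypothetical setting, named after the suggestion above; not a real Quixote API.
DEFAULT_CHARSET = "utf-8"


def read_input(raw, charset=DEFAULT_CHARSET):
    """Decode bytes from the network into text as early as possible."""
    return raw.decode(charset)


def render(name):
    """Application code sees text only, never encoded bytes."""
    return "<p>Hello, " + name + "</p>"


def write_output(stream, text, charset=DEFAULT_CHARSET):
    """Encode text into bytes only at the moment of output."""
    stream.write(text.encode(charset))


request_body = "дамјан".encode("utf-8")  # stand-in for bytes sent by a client
out = io.BytesIO()                       # stand-in for a network socket
write_output(out, render(read_input(request_body)))
```

On the warning about str: in the Python 2 of this era, calling str() on
a unicode object containing non-ASCII characters raised
UnicodeEncodeError, because the implicit conversion used the ASCII
codec. In Python 3, str() on a bytes object returns its repr (such as
"b'...'") instead of decoding it, so explicit decode() and encode()
calls at the boundaries are still the safer habit.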


Some references:
http://evanjones.ca/python-utf8.html
http://www.joelonsoftware.com/articles/Unicode.html

--
damjan | дамјан
This is my jabber ID --> damjan@bagra.net.mk <-- not my mail address!!!