Re: [Quixote-users] About _decode_string(s, charset) in http

About _decode_string(s, charset) in http_request.py

2005-08-21

Damjan

About _decode_string(s, charset) in http_request.py

Neil Schemenauer

2005-08-21

On Sun, Aug 21, 2005 at 05:19:30PM +0200, Damjan wrote:
> The function _decode_string(s, charset) in http_request.py improperly
> assumes that iso-8859-1 is the default charset.
> But the HTTPRequest class or an instance of that class can have a
> different default charset by changing the DEFAULT_CHARSET member
> variable.
>
> Actually it's not very clear what that function tries to do. It seems
> that it will return a unicode object for any other charset, but it will
> return a byte string if the charset is 'iso-8859-1'... this seems to be
> very wrong...
>
> Can someone with more knowledge explain what exactly this function
> should do?

AFAIK, the function works correctly.  Quixote tries to support
applications that use Unicode as well as applications that only use
str objects.  If an application uses only str objects, then the
charset of the response must be iso-8859-1.  Unfortunately, other
charsets are not supported for str-only applications (you would have
to patch Quixote).  If you use unicode strings, you can use any
charset that Python supports.

If your application only uses str objects, then Quixote cannot pass
it unicode objects as the application likely cannot handle them and
will end up raising a UnicodeDecodeError exception somewhere.
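
To make the behaviour described above concrete, here is a rough
sketch.  It is not the actual Quixote source: the name
decode_string_sketch is made up for illustration, and it is written
for current Python, where the old str type corresponds to bytes and
the old unicode type corresponds to str.

```python
def decode_string_sketch(s: bytes, charset: str):
    """Illustrative stand-in for the behaviour discussed in this thread.

    Hypothetical helper, not Quixote's real _decode_string.
    """
    if charset == "iso-8859-1":
        # str-only applications: hand the raw byte string back untouched.
        return s
    # Any other charset: decode to a text (unicode) object.
    return s.decode(charset)


utf8_bytes = "caf\u00e9".encode("utf-8")
latin1_bytes = b"caf\xe9"

# Non-default charset: the caller receives decoded text.
print(type(decode_string_sketch(utf8_bytes, "utf-8")).__name__)

# iso-8859-1: the caller receives the original bytes.
print(type(decode_string_sketch(latin1_bytes, "iso-8859-1")).__name__)
```

The two return types are the point: a str-only application never sees
a unicode object unless it asks for a charset other than iso-8859-1.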

> In the meanwhile I'll remove the check for iso-8859-1 from
> _decode_string so that it always returns a ".decode"-ed unicode object,
> and see if something breaks.

If your application can handle unicode strings correctly then that
should have no bad effect.
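
As a hedged illustration of that change, the sketch below always
decodes, and then shows what happens when code that still works with
byte strings receives the result.  The names always_decode_sketch and
can_concatenate are hypothetical, and the code targets current Python,
where mixing bytes and text raises TypeError rather than the
UnicodeDecodeError that older Python versions could raise.

```python
def always_decode_sketch(s: bytes, charset: str) -> str:
    """Hypothetical variant with the iso-8859-1 check removed."""
    return s.decode(charset)


def can_concatenate(prefix, value) -> bool:
    """Return True if prefix + value works, False on a type mismatch."""
    try:
        prefix + value
    except TypeError:
        return False
    return True


text = always_decode_sketch(b"caf\xe9", "iso-8859-1")

# An application that handles text throughout is unaffected.
print(can_concatenate("name: ", text))

# An application that still builds byte strings breaks on the same value.
print(can_concatenate(b"name: ", text))
```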

   Neil