In article ,
Neil Schemenauer writes:
nas> >>>> print htmltext(U"A\xa0B")
nas> > UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in
nas> > position 1: ordinal not in range(128)
nas>
nas> Does this work for you:
nas>
nas> >>> print U"A\xa0B"
nas>
nas> It works for me because:
nas>
nas> >>> import sys
nas> >>> sys.stdout.encoding
nas> 'UTF-8'
nas>
nas> Sometimes stdout is 'ascii' and so you have to manually set the
nas> encoding, eg:
nas>
nas> >>> import sys, codecs
nas> >>> sys.stdout = codecs.getwriter('utf-8')(sys.stdout)

FYI: On UNIX, the environment variables LANG, LC_CTYPE, and LC_ALL
affect sys.stdout.encoding, and also Py_FileSystemDefaultEncoding.
(I didn't know Python had such a mechanism. Why don't they affect
setdefaultencoding?)

% python
>>> import sys, os
>>> sys.stdout.encoding
'EUC-JP'
>>> os.environ['LANG']
'ja_JP.eucJP'
>>> print U"A\xa0B"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: EUC-JP encoding error: invalid character \xa0
% env LANG=ja_JP.utf8 python
>>> import sys
>>> sys.stdout.encoding
'UTF-8'
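
To check ahead of time whether a string will survive a given stdout
encoding, something like the following may help. It is only a rough
sketch in Python 3 syntax (the sessions above are Python 2), and the
helper names can_encode and encode_for_stream are made up for this
example; they are not part of htmltext or of the standard library.

```python
def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def encode_for_stream(text, encoding):
    """Encode text, escaping unrepresentable characters instead of raising."""
    return text.encode(encoding, errors="backslashreplace")


# Same string as in the sessions above: 'A', NO-BREAK SPACE, 'B'.
sample = "A\u00a0B"

for name in ("ascii", "euc_jp", "utf-8"):
    print(name, can_encode(sample, name), encode_for_stream(sample, name))
```

With a check like this, the caller can decide whether to wrap
sys.stdout in a UTF-8 writer (as in the quoted message) or to escape
the characters the terminal encoding cannot represent.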
-- kayama