On Fri, Jul 11, 2003 at 01:09:35PM +0400, Oleg Broytmann wrote:
> On Fri, Jul 11, 2003 at 10:32:25AM +0200, Bud P. Bruegger wrote:
> > But the functions you pointed me to don't affect accented characters,
> > umlauts, etc.
>
>    These functions are for *html* quoting. If you need *URL* quoting you
> need urllib.quote, urllib.quote_plus or quixote.html.url_quote.

I think the OP is looking for something that will turn 'é' into
'&eacute;' or '&#233;', for instance, for use where UTF-8 isn't
supported (and I'd suggest trying UTF-8 first!). Otherwise, something
like:

def escape_to_entities(string):
    # Escape '&' first so the entities added below aren't escaped again.
    string = string.replace('&', '&amp;')
    string = string.replace('<', '&lt;')
    string = string.replace('>', '&gt;')
    string = string.replace('"', '&quot;')
    result = []
    for s in string:
        # Anything outside ASCII becomes a numeric character reference.
        # (Pass a unicode string, or Latin-1 bytes, so ord() is the code point.)
        if ord(s) > 0x7f:
            s = '&#%d;' % ord(s)
        result.append(s)
    return ''.join(result)

Alternatively, with python 2.3, if you want to map to named entities:

import htmlentitydefs

# Map each code point with a named entity to u'&name;'
# (this covers '&', '<', '>' and '"' as well).
codepoint2entity = {}
for c in htmlentitydefs.codepoint2name:
    codepoint2entity[c] = u'&%s;' % unicode(htmlentitydefs.codepoint2name[c])

def escape_to_entities(string):
    # unicode() uses the default (ASCII) codec, so pass in a unicode
    # string if the input has non-ASCII characters.
    ustr = unicode(string).translate(codepoint2entity)
    result = []
    for s in ustr:
        # Characters without a named entity fall back to numeric references.
        if ord(s) > 0x7f:
            s = '&#%d;' % ord(s)
        result.append(s)
    return ''.join(result)

(This should also work in 2.2 if you use 2.3's htmlentitydefs)

'course, this would be a bit slow. Hmm, maybe there should be a codec so
you could do u'\u00e9'.encode('html-text').

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm@physics.mcmaster.ca
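
A footnote on the codec idea: the built-in 'xmlcharrefreplace' error handler (new in Python 2.3) already produces numeric character references when encoding to ASCII, so the character-by-character loop may not be needed. What follows is only a minimal sketch, written in Python 3 syntax, where htmlentitydefs is available as html.entities. The names escape_numeric and escape_named are hypothetical, picked for illustration, and are not part of any library.

```python
from html import escape
from html.entities import codepoint2name

# Translation table: code point -> '&name;' for every named entity.
# This includes '&', '<', '>' and '"', so markup characters are covered too.
NAMED_ENTITIES = {cp: '&%s;' % name for cp, name in codepoint2name.items()}


def escape_numeric(text):
    """Escape markup characters, then turn non-ASCII into &#NNN; references."""
    escaped = escape(text, quote=True)
    # 'xmlcharrefreplace' substitutes &#NNN; for anything ASCII cannot encode.
    return escaped.encode('ascii', 'xmlcharrefreplace').decode('ascii')


def escape_named(text):
    """Prefer named entities; fall back to numeric references otherwise."""
    translated = text.translate(NAMED_ENTITIES)
    return translated.encode('ascii', 'xmlcharrefreplace').decode('ascii')


print(escape_numeric('\u00e9'))   # &#233;
print(escape_named('\u00e9'))     # &eacute;
print(escape_named('\u0103'))     # &#259; (no named entity in the table)
```

Both helpers return plain ASCII text, so the result should be safe to emit where UTF-8 isn't supported. Since the encoding step runs in C, it would likely also avoid the slowness of the explicit loop.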