UnicodeEncodeError: 'ascii' codec can't encode character u'\xfa' in position 42: ordinal not in range(128)
Solution 1:
The problem is str(x[k]).encode('utf-8'). In Python 2, str(x[k]) converts a Unicode string to a byte string using the default ascii codec, which raises an error for any non-ASCII character:
>>> x = u'résumé'
>>> str(x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)
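The implicit conversion can be written out explicitly. As a rough sketch, assuming the default encoding has not been changed, calling .encode('ascii') by hand reproduces the failure that str() triggers behind the scenes in Python 2. The snippet is written so that it also runs on Python 3, and the variable names are illustrative only.

```python
# What Python 2's str(x) does implicitly for a Unicode string is
# roughly x.encode('ascii'). Doing it explicitly reproduces the error.
text = u'r\xe9sum\xe9'  # u'résumé', written with escapes

try:
    text.encode('ascii')
    error = None
except UnicodeEncodeError as exc:
    error = exc

# The exception records which codec failed and at which position.
print(error.encoding, error.start)

# Encoding with a codec that covers the character succeeds.
encoded = text.encode('utf-8')
print(repr(encoded))
```

The position reported by the exception is the index of the first character the codec could not represent, which is usually the quickest way to find the offending value.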
Non-Unicode values, like booleans, will be converted to byte strings, but Python 2 will then implicitly decode the byte string back to a Unicode string (again with the ascii codec) before calling .encode(), because you can only encode Unicode strings. This usually won't cause an error, because most non-Unicode objects have an ASCII representation. Here's an example where a custom object returns a non-ASCII str() representation:
>>> class Test(object):
...     def __str__(self):
...         return 'r\xc3\xa9sum\xc3\xa9'
...
>>> x = Test()
>>> str(x)
'r\xc3\xa9sum\xc3\xa9'
>>> str(x).encode('utf8')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
Note the above was a decode error instead of an encode error.
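The hidden decode step can be spelled out in the same way. The following is a sketch, assuming Python 3 syntax, of the two steps Python 2 effectively performs when .encode() is called on a byte string. The name `raw` is illustrative only.

```python
# Python 2 treats byte_string.encode('utf8') roughly as
# byte_string.decode('ascii').encode('utf8'); the decode step fails.
raw = b'r\xc3\xa9sum\xc3\xa9'  # UTF-8 bytes for u'résumé'

try:
    raw.decode('ascii').encode('utf8')
    error = None
except UnicodeDecodeError as exc:
    error = exc

# On Python 3, indexing bytes yields an int, so hex() shows the bad byte.
print(type(error).__name__, hex(raw[error.start]), error.start)

# Decoding with the codec the bytes were actually written in works.
text = raw.decode('utf-8')
```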
If str() is only there to coerce booleans to a string, coerce to a Unicode string instead:
unicode(x[k]).encode('utf-8')
Non-Unicode values will be converted to Unicode strings, which can then be correctly encoded, but Unicode strings will remain unchanged, so they will also be encoded correctly.
>>> x = True
>>> unicode(x)
u'True'
>>> unicode(x).encode('utf8')
'True'
>>> x = u'résumé'
>>> unicode(x).encode('utf8')
'r\xc3\xa9sum\xc3\xa9'
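If the same conversion is needed in several places, it could be wrapped in a small helper. This is only a sketch: `to_utf8` and `text_type` are hypothetical names, and passing byte strings through unchanged is an assumption (it presumes they are already UTF-8) rather than something the answer prescribes.

```python
import sys

# On Python 2 the text type is `unicode`; on Python 3 it is `str`.
# The conditional expression only looks up `unicode` on Python 2.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def to_utf8(value):
    """Return value as UTF-8 encoded bytes (hypothetical helper)."""
    if isinstance(value, bytes):
        # Assumption: byte strings are already UTF-8, so pass them through.
        return value
    # Coerce to text first, then encode explicitly.
    return text_type(value).encode('utf-8')


print(repr(to_utf8(True)))
print(repr(to_utf8(u'r\xe9sum\xe9')))
```

Coercing to the text type before encoding keeps the ascii codec out of the picture for booleans, numbers, and Unicode strings alike.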
P.S. Python 3 does not do implicit encode/decode between byte and Unicode strings and makes these errors easier to spot.
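A short Python 3 sketch of that stricter behaviour follows; the variable names are illustrative only.

```python
# Python 3 only: mixing text and bytes raises TypeError immediately
# instead of silently going through the ascii codec.
text = 'r\xe9sum\xe9'  # 'résumé'
data = text.encode('utf-8')

try:
    text + data
    mixed_error = None
except TypeError as exc:
    mixed_error = exc

# bytes has no .encode() method in Python 3, so calling .encode() on
# bytes fails with AttributeError rather than a hidden decode error.
has_encode = hasattr(data, 'encode')

# str() on a non-text value returns text, which can then be encoded.
flag_bytes = str(True).encode('utf-8')

print(type(mixed_error).__name__, has_encode, flag_bytes)
```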