UnicodeEncodeError: 'ascii' codec can't encode character u'\xfa' in position 42: ordinal not in range(128)

February 19, 2024

def main():
    client = ##client_here
    db = client.brazil
    rio_bus = client.tweets
    result_cursor = db.tweets.find()
    first = result_cursor[0]
    ordered_fieldnames =

Solution 1:

str(x[k]).encode('utf-8') is the problem.

In Python 2, str(x[k]) converts a Unicode string to a byte string using the default ascii codec, which fails as soon as the string contains a non-ASCII character:

>>> x = u'résumé'
>>> str(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)
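The exception object itself says which character caused the failure, which helps when the message only reports something like "position 42". A minimal sketch, assuming the implicit conversion is reproduced with an explicit .encode('ascii') call; the variable names are illustrative only:

```python
# Reproduce the failure explicitly and inspect the exception.
text = u'r\xe9sum\xe9'  # u'résumé', written with escapes

position = None
bad_char = None
try:
    text.encode('ascii')
except UnicodeEncodeError as exc:
    # exc.object is the string being encoded; exc.start and exc.end
    # delimit the slice that could not be encoded.
    position = exc.start
    bad_char = exc.object[exc.start:exc.end]

print(position, repr(bad_char))
```

The same attributes are available on the error raised by the original code, so catching it once around the failing line should reveal which field holds the non-ASCII text.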

Non-Unicode values, like booleans, are converted to byte strings by str(). Because only Unicode strings can be encoded, Python 2 then implicitly decodes that byte string back to Unicode with the ascii codec before calling .encode(). This usually doesn't cause an error, because most non-Unicode objects have an ASCII representation. Here's an example where a custom object returns a non-ASCII str() representation:

>>> class Test(object):
...     def __str__(self):
...         return 'r\xc3\xa9sum\xc3\xa9'
...
>>> x = Test()
>>> str(x)
'r\xc3\xa9sum\xc3\xa9'
>>> str(x).encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

Note the above was a decode error instead of an encode error.
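The hidden step can be written out by hand to see where the decode error comes from. This is only a sketch of what the implicit conversion amounts to, assuming the default codec is ascii; it is not the interpreter's actual code path:

```python
raw = b'r\xc3\xa9sum\xc3\xa9'  # UTF-8 bytes for u'résumé'

# What the implicit conversion amounts to: decode as ASCII, then encode.
error = None
try:
    raw.decode('ascii').encode('utf8')
except UnicodeDecodeError as exc:
    error = exc

# Naming the real encoding of the bytes makes the round trip succeed.
fixed = raw.decode('utf8').encode('utf8')

print(type(error).__name__, fixed == raw)
```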

If str() is only there to coerce booleans to a string, coerce it to a Unicode string instead:

unicode(x[k]).encode('utf-8')

Non-Unicode values will be converted to Unicode strings, which can then be correctly encoded, but Unicode strings will remain unchanged, so they will also be encoded correctly.

>>> x = True
>>> unicode(x)
u'True'
>>> unicode(x).encode('utf8')
'True'
>>> x = u'résumé'
>>> unicode(x).encode('utf8')
'r\xc3\xa9sum\xc3\xa9'
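The same coercion can be wrapped in a small helper and applied to every value in a record before writing it out. The sketch below is an assumption-laden illustration: to_utf8, text_type, and the sample field names are hypothetical and not part of the original code, and byte strings are assumed to be UTF-8 already.

```python
import sys

# Text type: `unicode` on Python 2, `str` on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def to_utf8(value):
    """Return `value` as UTF-8 encoded bytes (hypothetical helper)."""
    if isinstance(value, bytes):
        # Assumption: byte strings are already UTF-8, so pass them through.
        return value
    # text_type() leaves Unicode strings unchanged and turns booleans,
    # numbers, etc. into their text representation.
    return text_type(value).encode('utf-8')


# Illustrative record; the field names are made up for this example.
row = {'user': u'Jos\xe9', 'retweeted': True, 'count': 3}
encoded = {key: to_utf8(val) for key, val in row.items()}
print(encoded)
```

Passing byte strings through untouched avoids the implicit ASCII decode described earlier, at the cost of trusting that those bytes really are UTF-8.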

P.S. Python 3 does not do implicit encode/decode between byte and Unicode strings and makes these errors easier to spot.
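As a rough illustration of that difference, the following sketch is meant to be run on Python 3, where mixing the two types fails immediately instead of triggering a hidden ASCII conversion. The variable names are illustrative only:

```python
text = 'r\xe9sum\xe9'        # str: Unicode text ('résumé')
data = text.encode('utf-8')  # bytes: the UTF-8 encoding of that text

# Mixing text and bytes raises TypeError rather than converting silently.
mix_error = None
try:
    text + data
except TypeError as exc:
    mix_error = exc

# bytes objects have no .encode() method on Python 3, so the
# "encode an already encoded string" mistake cannot happen quietly.
has_encode = hasattr(data, 'encode')

# Conversions must be explicit and name the codec.
roundtrip = data.decode('utf-8')

print(type(mix_error).__name__, has_encode, roundtrip == text)
```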
