
Python 2.7 Encoding Decoding

I have a problem involving encoding/decoding. I read text from a file and compare it with text from a database (Postgres). The comparison is done between two lists. From the file I get 'jo\x9a' for

Solution 1:

The first string appears to be encoded as Windows-1250, and the second as UTF-8.

>>> print 'jo\x9a'.decode('windows-1250')
još
>>> print 'jo\xc5\xa1'.decode('utf-8')
još
>>> 'jo\x9a'.decode('windows-1250') == 'jo\xc5\xa1'.decode('utf-8')
True
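
One way to check a guess like this is to try decoding the raw bytes with each candidate encoding and see which one yields the expected text. The following is only a minimal sketch: the function name `matching_encodings` and the candidate list are illustrative assumptions, not part of the original answer. It uses `b'...'` literals so that it should run on both Python 2.7 and Python 3.

```python
def matching_encodings(raw, expected, candidates):
    """Return the candidate encodings under which `raw` decodes to `expected`."""
    matches = []
    for name in candidates:
        try:
            if raw.decode(name) == expected:
                matches.append(name)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; skip it.
            pass
    return matches


# Expected text: 'jo' followed by U+0161 (s with caron).
expected = u'jo\u0161'
# Illustrative candidate list; extend it with whatever encodings are plausible.
candidates = ['utf-8', 'windows-1250', 'latin-1']

from_file = matching_encodings(b'jo\x9a', expected, candidates)
from_db = matching_encodings(b'jo\xc5\xa1', expected, candidates)
```

With the two byte strings from the question, only Windows-1250 matches the file value and only UTF-8 matches the database value, which agrees with the comparison above.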

Solution 2:

Your file strings seem to be Windows-1250 encoded, while your database seems to contain UTF-8 strings.

So you can either first convert all strings to unicode:

codes_from_file = [a.decode("windows-1250") for a in codes_from_file]
kode_prfoksov = [a.decode("utf-8") for a in kode_prfoksov]

or, if you do not want unicode strings, just convert the file strings to UTF-8:

codes_from_file = [a.decode("windows-1250").encode("utf-8") for a in codes_from_file]
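
Once both lists use the same representation, the comparison itself is straightforward. The sketch below decodes both sides to unicode and compares them as sets. The sample data and the helper `to_unicode` are hypothetical and only for illustration; it also assumes `kode_prfoksov` is the list that came from the database. The `b'...'` literals keep it runnable on both Python 2.7 and Python 3.

```python
# Hypothetical sample data: raw byte strings as read from the file
# (Windows-1250) and as returned by the database (UTF-8).
codes_from_file = [b'jo\x9a', b'abc']
kode_prfoksov = [b'jo\xc5\xa1', b'xyz']


def to_unicode(items, encoding):
    """Decode every byte string in `items` using `encoding`."""
    return [item.decode(encoding) for item in items]


file_codes = to_unicode(codes_from_file, 'windows-1250')
db_codes = to_unicode(kode_prfoksov, 'utf-8')

# Compare as sets now that both sides are unicode.
common = sorted(set(file_codes) & set(db_codes))
only_in_file = sorted(set(file_codes) - set(db_codes))
```

Decoding at the boundary (right after reading from the file or the database) and working with unicode everywhere else tends to avoid this class of mismatch.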
