Converting Unicode Sequences To A String In Python 3
While parsing an HTML response to extract data with Python 3.4 on Kubuntu 15.10 in the Bash CLI, print() gives me output that looks like this: \u05ea\u05d4 \u05e0\u05e9\u05d
Solution 1:
It appears your input uses backslash as an escape character; you should unescape the text before passing it to json:
>>> foobar = '{\\"body\\": \\"\\\\u05e9\\"}'
>>> import re
>>> json_text = re.sub(r'\\(.)', r'\1', foobar)  # unescape
>>> import json
>>> print(json.loads(json_text)['body'])
ש
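The same two steps can be wrapped in small helpers. This is only a sketch, assuming the input carries exactly one extra level of backslash escaping as in the example above; the names unescape_backslashes and load_escaped_json are made up for illustration and are not part of any library.

```python
import json
import re


def unescape_backslashes(text):
    """Remove one level of backslash escaping: a backslash followed by
    any character (except a newline) is replaced by that character."""
    return re.sub(r'\\(.)', r'\1', text)


def load_escaped_json(text):
    """Unescape the text once, then parse the result as JSON."""
    return json.loads(unescape_backslashes(text))


# Raw string with the same value as foobar in the session above.
raw = r'{\"body\": \"\\u05e9\"}'
data = load_escaped_json(raw)

# Compare instead of printing the character, so the check also works
# on terminals that cannot display Hebrew.
print(data['body'] == '\u05e9')
```

If the text turns out not to be escaped at all, skip the unescape step and call json.loads() directly; applying the regex to ordinary JSON would strip backslashes that belong to the JSON escapes themselves.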
Don't use the 'unicode-escape' encoding on JSON text; it may produce different results:
>>> import json
>>> json_text = '["\\ud83d\\ude02"]'
>>> json.loads(json_text)
['😂']
>>> json_text.encode('ascii', 'strict').decode('unicode-escape')  # XXX don't do it
'["\ud83d\ude02"]'
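'😂' == '\U0001F602' is U+1F602 (FACE WITH TEARS OF JOY). json.loads() combines the surrogate pair \ud83d\ude02 into that single character, while 'unicode-escape' leaves two separate surrogate code points in the string.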
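To see the difference in practice, here is a small check using the same json_text as above. It is an illustrative sketch; the variable names are my own. The string produced by 'unicode-escape' contains lone surrogates, which cannot be encoded to UTF-8, whereas the json.loads() result is an ordinary one-character string.

```python
import json

json_text = '["\\ud83d\\ude02"]'

# Correct: the JSON parser joins the surrogate pair into one code point.
parsed = json.loads(json_text)[0]

# Not recommended: 'unicode-escape' decodes each \uXXXX on its own,
# so the result holds two lone surrogates between the quotes.
decoded = json_text.encode('ascii').decode('unicode-escape')
surrogates = decoded[2:4]

print(len(parsed), len(surrogates))

# The parsed character encodes to UTF-8 without trouble.
utf8_bytes = parsed.encode('utf-8')

# The lone surrogates do not.
try:
    surrogates.encode('utf-8')
    encodable = True
except UnicodeEncodeError:
    encodable = False

print(encodable)
```

That encoding failure is the kind of error that tends to surface later, for example when the string is written to a file or a UTF-8 terminal.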