Converting Unicode Codepoints Into Unicode Character Using Python 3.3.1
Solution 1:
Two options:
Choose to interpret it as JSON; that format uses the same escape codes. The input does need to have quotes around it to be seen as a string.
Encode to latin 1 (to preserve bytes), then decode with the
unicode_escape
codec:>>> urllib.parse.unquote(sig).encode('latin1').decode('unicode_escape') '45C482D2486105B02211ED4A0E3163A9F7095E81.4DDB3B3A13C77FE508DCFB7C6CC68957096A406C&type=video/3gpp;+codecs="mp4v.20.3,+mp4a.40.2"&quality=small&itag=17&url=http://r6---sn-cx5h-itql.c.youtube.com/videoplayback?source=youtube&mt=1367776467&expire=1367797699&itag=17&factor=1.25&upn=pkX9erXUHx4&cp=U0hVTFdUVV9OU0NONV9PTllHOnhGdTVLUThqUWJW&key=yt1&id=ab9b0e2f311eaf00&mv=m&newshard=yes&ms=au&ip=49.205.30.138&sparams=algorithm%2Cburst%2Ccp%2Cfactor%2Cid%2Cip%2Cipbits%2Citag%2Csource%2Cupn%2Cexpire&burst=40&algorithm=throttle-factor&ipbits=8&fexp=917000%2C919366%2C916626%2C902533%2C932000%2C932004%2C906383%2C904479%2C901208%2C925714%2C929119%2C931202%2C900821%2C900823%2C912518%2C911416%2C930807%2C919373%2C906836%2C926403%2C900824%2C912711%2C929606%2C910075&sver=3&fallback_host=tc.v19.cache2.c.youtube.com'
This interprets \u
escape codes just like it Python would do when reading string literals in Python source code.
Solution 2:
If I'm guessing right, this is more or less a URL. The '%xx' encodes a single byte outside the allowed character set. The '\uxxxx' encodes a Unicode codepoint. I believe that it is normal for URLs to encode Unicode characters as UTF-8 and then to encode the bytes outside the allowed charset as '%xx' (which affects all multibyte UTF-8 sequences). This makes it surprising that there are '%xx'-encoded bytes already, because translating the Unicode codepoints will make the conversions irreversible.
Make sure you have tests and that you can verify the actual results, because this seems like it was unsafe. At least I don't fully understand the requirements here.
Post a Comment for "Converting Unicode Codepoints Into Unicode Character Using Python 3.3.1"