Identical Tuples Give Different Pickles
The situation is pretty strange and I expect that there is something that I don't know about the pickle module. I have two tuples s1 and s2. If I compare them it returns True. s1 =
Solution 1:
The problem is that pickle preserves identity relationships between subobjects (this is for efficiency and to handle recursive objects):
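import pickle

# Build the strings at run time: on newer CPython versions a literal
# "a" * 30 may be constant-folded and interned, which would make a1 and a2
# the same object and hide the effect.
n = 30
a1 = "a" * n
a2 = "a" * n
s1 = (a1, a2, a1)
s2 = (a1, a2, a2)
print(pickle.dumps(s1) == pickle.dumps(s2))
False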
Here, a1 and a2 are equal but are different objects (they have a different id; see "When does python choose to intern a string"); s1 and s2 are again equal, but in s1 the third element is the same object as the first element, while in s2 the third element is the same object as the second element.
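The identity relationships described above can be checked directly with the is operator. The following is a minimal sketch, assuming CPython; the strings are built with str.join at run time so that they are equal but guaranteed to be separate objects, and the names r1 and r2 are introduced here only for illustration.

```python
import pickle

# Build the strings at run time so that they are equal but distinct objects.
a1 = "".join(["a"] * 30)
a2 = "".join(["a"] * 30)

s1 = (a1, a2, a1)
s2 = (a1, a2, a2)

print(a1 == a2, a1 is a2)              # equal values, different objects
print(s1 == s2)                        # tuples compare element by element
print(s1[2] is s1[0], s2[2] is s2[1])  # the sharing pattern differs
print(pickle.dumps(s1) == pickle.dumps(s2))

# The sharing pattern survives a round trip through pickle.
r1 = pickle.loads(pickle.dumps(s1))
r2 = pickle.loads(pickle.dumps(s2))
print(r1 == r2)
print(r1[2] is r1[0], r2[2] is r2[1])
```

The unpickled tuples are still equal to each other, and each one keeps the sharing pattern of the tuple it was created from, which is exactly the information the two pickles encode differently.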
We can see this by disassembling the pickles:
>>> pickletools.dis(pickle.dumps(s1))
    0: \x80 PROTO      3
    2: X    BINUNICODE 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
   37: q    BINPUT     0
   39: X    BINUNICODE 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
   74: q    BINPUT     1
   76: h    BINGET     0    # <=== watch this line *********************
   78: \x87 TUPLE3
   79: q    BINPUT     2
   81: .    STOP
highest protocol among opcodes = 2
>>> pickletools.dis(pickle.dumps(s2))
    0: \x80 PROTO      3
    2: X    BINUNICODE 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
   37: q    BINPUT     0
   39: X    BINUNICODE 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
   74: q    BINPUT     1
   76: h    BINGET     1    # <=== this is different *********************
   78: \x87 TUPLE3
   79: q    BINPUT     2
   81: .    STOP
highest protocol among opcodes = 2
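Instead of comparing the two listings by eye, the differing opcode can also be located programmatically. This is a rough sketch using pickletools.genops; the helper name opcode_list is hypothetical, and protocol 3 is pinned only so that the opcodes match the listing above.

```python
import pickle
import pickletools

# Equal but distinct strings, built at run time.
a1 = "".join(["a"] * 30)
a2 = "".join(["a"] * 30)
s1 = (a1, a2, a1)
s2 = (a1, a2, a2)

def opcode_list(data):
    """Hypothetical helper: return (opcode name, argument) pairs of a pickle."""
    return [(op.name, arg) for op, arg, pos in pickletools.genops(data)]

ops1 = opcode_list(pickle.dumps(s1, protocol=3))
ops2 = opcode_list(pickle.dumps(s2, protocol=3))

# Keep only the positions where the two opcode streams disagree.
diff = [(x, y) for x, y in zip(ops1, ops2) if x != y]
print(diff)  # [(('BINGET', 0), ('BINGET', 1))]
```

The only difference is the memo index fetched by BINGET: index 0 refers to the first string and index 1 to the second.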
A quick (if inefficient) workaround could be to break the object identities by using ast.literal_eval to recreate the tuples from their repr:
>>> print(pickle.dumps(ast.literal_eval(repr(s1))) == pickle.dumps(ast.literal_eval(repr(s2))))
True
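If this workaround is needed in more than one place, it can be wrapped in a small function. The sketch below is only an illustration: canonical_pickle is a hypothetical name, not part of the pickle module, and the approach assumes the object consists solely of values whose repr is a Python literal (strings, numbers, tuples, lists, dicts and similar).

```python
import ast
import pickle

def canonical_pickle(obj):
    """Hypothetical helper: pickle a copy of obj rebuilt from its repr.

    Equal objects have the same repr here, so the rebuilt copies have the
    same internal sharing and therefore produce the same pickle.
    Assumes repr(obj) is a valid Python literal.
    """
    return pickle.dumps(ast.literal_eval(repr(obj)))

# Equal but distinct strings, built at run time.
a1 = "".join(["a"] * 30)
a2 = "".join(["a"] * 30)
s1 = (a1, a2, a1)
s2 = (a1, a2, a2)

print(pickle.dumps(s1) == pickle.dumps(s2))          # False
print(canonical_pickle(s1) == canonical_pickle(s2))  # True
```

The cost is an extra repr and parse on every call, so this is better suited to small values, for example when pickles are used as dictionary keys or compared in tests.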