-
-
Notifications
You must be signed in to change notification settings - Fork 32.3k
Pickle BINSTRING
incorrect data type for size #135321
Description
Bug report
Bug description:
The BINSTRING
opcode in Pickle has two arguments: a "4-byte little-endian signed int" length, and a string of that many bytes (see the code comment for BINSTRING
in pickletools
). Since it is signed, this means that any provided value over 0x7fffffff
would be interpreted as a negative number. The Python pickle implementation specifically treats it as a signed 32-bit length and checks to see if the length is < 0:
Lines 1454 to 1458 in 754e7c9
def load_binstring(self): | |
# Deprecated BINSTRING uses signed 32-bit length | |
len, = unpack('<i', self.read(4)) | |
if len < 0: | |
raise UnpicklingError("BINSTRING pickle has negative byte count") |
However, the C pickle implementation runs calc_binsize(s, 4)
for BINSTRING
and returns a Py_ssize_t
. Since Py_ssize_t
data types are the same size as the compiler's size_t
type (PEP0353), this means a Py_ssize_t
is 64-bits long on 64-bit systems. Since the size
variable here is also a Py_ssize_t
, that means the threshold for negative values is much higher.
Lines 5546 to 5558 in a58026a
Py_ssize_t size; | |
char *s; | |
if (_Unpickler_Read(self, st, &s, nbytes) < 0) | |
return -1; | |
size = calc_binsize(s, nbytes); | |
if (size < 0) { | |
PyErr_Format(st->UnpicklingError, | |
"BINSTRING exceeds system's maximum size of %zd bytes", | |
PY_SSIZE_T_MAX); | |
return -1; | |
} |
This is all just the background to illustrate that because size
is not an int
, a pickle with the BINSTRING
opcode using a length > 0x7fffffff will fail in the Python implementation (since it's negative), but deserialize just fine in the C implementation.
The following payload will demonstrate the difference:
payload: b'T\x00\x00\x00\x80'+b'a'*0x80000000 + b'.'
pickle: FAILURE BINSTRING pickle has negative byte count
_pickle.c: aaaaaaaaaaaaaaaaaaa....
pickletools: pickletools failed string4 byte count < 0: -2147483648
The required payload does need to be 2GB in size which is very large, but not impossible on modern systems.
Note that the LONG4
opcode is in a similar situation, except the output for calc_binint()
is an int
data type so this issue does not exist there.
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux
Linked PRs
- gh-135321: Changing data type of
size
variable forBINSTRING
in_pickle
#135322 - [3.14] gh-135321: Always raise a correct exception for BINSTRING argument > 0x7fffffff in pickle (GH-135322) #135382
- [3.13] gh-135321: Always raise a correct exception for BINSTRING argument > 0x7fffffff in pickle (GH-135322) #135383
Metadata
Metadata
Assignees
Labels
Status
Milestone
Relationships
Development
Issue actions