Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
In the fastbinary Python deserializer, allocating a list is done like so:

    len = readI32(input);
    if (!check_ssize_t_32(len)) {
        return NULL;
    }
    ret = PyList_New(len);
The only validation of len is that it's under INT_MAX. I've encountered a situation where, upon receiving garbage input, len was read as something like one billion; the library treated this as valid input, pre-allocated gigabytes of RAM (on a 64-bit build, a billion-element list costs roughly 8 GB of pointer storage alone), and crashed the server.
The quick fix I made was to limit list sizes to a sane value of a few thousand, which more than suits my personal needs.
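For illustration, a minimal sketch of such a cap layered onto the snippet above; MAX_LIST_SIZE is a hypothetical constant, and its value is arbitrary and application-specific:

    /* Hypothetical cap; tune to the application's actual needs. */
    #define MAX_LIST_SIZE 10000

    len = readI32(input);
    if (!check_ssize_t_32(len)) {
        return NULL;
    }
    if (len > MAX_LIST_SIZE) {
        PyErr_SetString(PyExc_ValueError,
                        "list size in input exceeds configured limit");
        return NULL;
    }
    ret = PyList_New(len);

THRIFT-3532 (linked below) later took this direction, making string and container read size limits configurable in the Python protocols.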
But IMO this should be dealt with properly. One way that comes to mind is to not pre-allocate the entire list up front when the claimed length is large, and instead grow it in smaller steps while reading the input; a sketch of that approach follows.
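A rough sketch of the incremental approach against the CPython API; decode_val and DecodeBuffer here are stand-ins for however fastbinary internally decodes one element, not its actual interface:

    /* Sketch only: grow the list element by element instead of trusting
     * the length prefix. decode_val()/DecodeBuffer are placeholders for
     * fastbinary's internal per-element decoder. */
    static PyObject*
    decode_list_incremental(DecodeBuffer* input, PyObject* typeargs) {
        int32_t len = readI32(input);
        if (!check_ssize_t_32(len)) {
            return NULL;
        }

        PyObject* ret = PyList_New(0);   /* start empty; no up-front allocation */
        if (ret == NULL) {
            return NULL;
        }

        for (int32_t i = 0; i < len; ++i) {
            /* A garbage length prefix now fails here, once the real
             * input runs out, instead of allocating gigabytes first. */
            PyObject* item = decode_val(input, typeargs);
            if (item == NULL || PyList_Append(ret, item) != 0) {
                Py_XDECREF(item);
                Py_DECREF(ret);
                return NULL;
            }
            Py_DECREF(item);   /* PyList_Append took its own reference */
        }
        return ret;
    }

PyList_Append grows the backing array geometrically, so the cost stays amortized O(1) per element while memory use tracks the bytes actually consumed from the wire rather than the attacker-controlled length prefix.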
Issue Links
- breaks:
  - IMPALA-10816 'NoneType' object is not iterable exception when query returns more than 10.000 rows (Open)
- is superseded by:
  - THRIFT-3532 Add configurable string and container read size limit to Python protocols (Closed)