Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
The simplest reproducer is:
import pyarrow as pa
pa.read_message(b'')
which gives a segfault.
It is easy to run into this interactively, e.g. by accidentally passing an already-read buffer to it:
serialized = pa.schema([('a', pa.int64())]).serialize().to_pybytes()
buffer = pa.BufferReader(serialized)
pa.read_message(buffer)
pa.read_message(buffer)
By comparison, read_schema raises an error on the second call, i.e. when the buffer is already exhausted:
>>> pa.read_schema(buffer)
>>> pa.read_schema(buffer)
...
ArrowInvalid: Tried reading schema message, was null or length 0
I know this is not proper usage of a Buffer(Reader), but since it is easy to do by accident, I think we should protect users from it.
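Until the library guards against this itself, a caller-side check could look roughly like the sketch below. It is only an illustration, not the fix that went into Arrow: `remaining_bytes` and `checked_read` are hypothetical helper names, and the sketch assumes the source is either a bytes-like object or a seekable file-like object with standard `tell()`/`seek()` semantics (which `pa.BufferReader` is assumed to follow).

```python
import io


def remaining_bytes(source):
    """Return the number of unread bytes in `source`.

    Hypothetical helper: accepts bytes-like objects or seekable
    file-like objects (assumed to include pa.BufferReader).
    """
    if isinstance(source, (bytes, bytearray, memoryview)):
        return len(source)
    pos = source.tell()            # remember the current read position
    source.seek(0, io.SEEK_END)    # jump to the end to learn the total size
    end = source.tell()
    source.seek(pos)               # restore the original position
    return end - pos


def checked_read(source, reader):
    """Call `reader(source)` only if `source` still has unread bytes.

    Raises ValueError instead of handing an empty or already-consumed
    source to `reader`.
    """
    if remaining_bytes(source) == 0:
        raise ValueError("nothing left to read: source is empty or already consumed")
    return reader(source)
```

With this in place, `checked_read(buffer, pa.read_message)` would raise a `ValueError` on the second call in the example above rather than reaching the native code with an exhausted buffer.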