Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
When serializing a large Arrow array (around 2**30 entries), I get a non-success status when using the FileReader to read the array back:
"Bad status: Invalid: flatbuffer size 0 invalid. File offset: 660, metadata length: 0"
How to reproduce:
Check out the branch arrow-large-objects from https://github.com/pcmoritz/ray-1, and follow http://ray.readthedocs.io/en/latest/install-on-ubuntu.html with that branch.
Then run
python test/jenkins_tests/multi_node_tests/large_memory_test.py
in the ray root directory.
Most likely an int32_t overflows somewhere, but I haven't been able to track it down. The only int32_t values used by the FileReader seem to be for the flatbuffer metadata size, which should be small.
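To illustrate the kind of overflow suspected here, the following minimal sketch shows what happens when a byte count for an array of this size is squeezed into a signed 32-bit integer. It is not Arrow code and not a diagnosis: the helper `as_int32` is hypothetical, and the 8 bytes per entry is an assumption, since the element type is not stated above.

```python
INT32_MAX = 2**31 - 1


def as_int32(value: int) -> int:
    """Wrap a Python int to a signed 32-bit value (two's complement),
    mimicking what storing it in a C int32_t would typically produce."""
    return ((value + 2**31) % 2**32) - 2**31


num_entries = 2**30      # array length from the report
bytes_per_entry = 8      # assumption: 64-bit values

total_bytes = num_entries * bytes_per_entry   # 2**33 bytes
wrapped = as_int32(total_bytes)

print("total bytes:      ", total_bytes)
print("fits in int32_t:  ", total_bytes <= INT32_MAX)
print("value after wrap: ", wrapped)
```

With these assumed numbers the true size no longer fits in an int32_t, so any size or offset field of that type along the write or read path would hold a wrong value. Whether that is what actually happens in the FileReader is exactly what still needs to be tracked down.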