  1. Apache Arrow
  2. ARROW-740

FileReader fails for large objects

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.3.0
    • Component/s: C++
    • Labels: None

    Description

      After serializing a large Arrow array (around 2**30 entries), I get a non-success status when I use the FileReader to read the array back:

      "Bad status: Invalid: flatbuffer size 0 invalid. File offset: 660, metadata length: 0"

      How to reproduce:

      Check out the branch arrow-large-objects from https://github.com/pcmoritz/ray-1, and follow http://ray.readthedocs.io/en/latest/install-on-ubuntu.html with that branch.

      Then run
      python test/jenkins_tests/multi_node_tests/large_memory_test.py
      in the ray root directory.

      Most likely an int32_t overflows somewhere, but I haven't been able to track it down. The only int32_t values used by the FileReader seem to be for the flatbuffer metadata size, which should be small.
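      To illustrate the kind of overflow suspected here, the following is a minimal sketch, not code from Arrow. It assumes 8-byte entries (the report does not state the element type) and uses a hypothetical helper, as_int32, to mimic what typically happens when a 64-bit size is narrowed to an int32_t in C++.

```python
import ctypes

INT32_MAX = 2**31 - 1


def as_int32(value: int) -> int:
    """Truncate a Python int to a signed 32-bit value (hypothetical helper).

    ctypes performs no overflow checking, so this keeps only the low 32 bits,
    which is what a narrowing cast to int32_t typically does in C++.
    """
    return ctypes.c_int32(value).value


num_entries = 2**30    # array size from the report
bytes_per_entry = 8    # assumption: 64-bit values; not stated in the report

true_size = num_entries * bytes_per_entry

# The real byte count no longer fits in a signed 32-bit integer.
print("fits in int32:", true_size <= INT32_MAX)

# Value a 32-bit field would end up holding after truncation.
print("true size:", true_size, "truncated:", as_int32(true_size))
```

      Any size or offset that passes through a 32-bit field on the write or read path could be corrupted this way once the payload grows past 2 GiB, even if the metadata itself stays small. Whether this is what happens in the FileReader is not established by the report.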

          People

            Assignee: Unassigned
            Reporter: Philipp Moritz (pcmoritz)
            Votes: 0
            Watchers: 3
