  1. Apache Arrow
  2. ARROW-399

[Java] ListVector.loadFieldBuffers ignores the ArrowFieldNode length metadata

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.2.0
    • Component/s: Java
    • Labels: None

    Description

      Discovered this during integration testing. Because Arrow C++ writes buffers padded to 64 bytes, a buffer may appear larger to the Java library than its contents require. In ListVector.loadFieldBuffers, the ArrowFieldNode argument is never used:

        @Override
        public void loadFieldBuffers(ArrowFieldNode fieldNode, List<ArrowBuf> ownBuffers) {
          BaseDataValueVector.load(getFieldInnerVectors(), ownBuffers);
        }
      

      The value count of the resulting ListVector is thus inferred from the size of the offsets buffer. For a length-7 vector written by C++, the 8 int32 offsets occupy 32 bytes, which are padded to exactly 64 bytes (padding for SIMD). Java infers from the 64 bytes that the value count is 15 (64 / 4 - 1), and the integration test fails.
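The arithmetic above can be reproduced with a small standalone sketch. It does not use the Arrow Java API: the class `OffsetCountSketch` and all of its methods are hypothetical names made up for illustration. It assumes 4-byte offsets and 64-byte padding, as described in this issue. It contrasts inferring the value count from the buffer size with taking it from the field node length, which is one plausible direction for a fix and not necessarily the change that was committed.

```java
public class OffsetCountSketch {
    // Assumptions taken from the issue text: int32 offsets, 64-byte padding.
    static final int OFFSET_WIDTH = 4;
    static final int ALIGNMENT = 64;

    // Round a byte count up to the next multiple of ALIGNMENT.
    public static int paddedSize(int bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Size of a padded offsets buffer for a list vector with valueCount entries
    // (a list vector of n values needs n + 1 offsets).
    public static int offsetsBufferSize(int valueCount) {
        return paddedSize((valueCount + 1) * OFFSET_WIDTH);
    }

    // What the reported behavior amounts to: derive the count from the buffer size.
    public static int inferredValueCount(int bufferBytes) {
        return bufferBytes / OFFSET_WIDTH - 1;
    }

    // Hypothetical alternative: trust the node length, and only check that
    // the buffer is large enough to hold the offsets it implies.
    public static int valueCountFromNode(int nodeLength, int bufferBytes) {
        int required = (nodeLength + 1) * OFFSET_WIDTH;
        if (bufferBytes < required) {
            throw new IllegalArgumentException(
                "offsets buffer has " + bufferBytes + " bytes, need " + required);
        }
        return nodeLength;
    }

    public static void main(String[] args) {
        int nodeLength = 7;
        int bufferBytes = offsetsBufferSize(nodeLength);
        System.out.println("buffer bytes:    " + bufferBytes);                                  // 64
        System.out.println("inferred count:  " + inferredValueCount(bufferBytes));              // 15
        System.out.println("node-based count: " + valueCountFromNode(nodeLength, bufferBytes)); // 7
    }
}
```

With a length of 7, the padded buffer is 64 bytes, so the size-based inference yields 15 while the node-based count stays at 7.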

      Attachments

        1. list_error.json
          3 kB
          Wes McKinney

            People

              Assignee: Julien Le Dem
              Reporter: Wes McKinney