Apache Arrow / ARROW-809

C++: Writing sliced record batch to IPC writes the entire array

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.3.0
    • Component/s: C++
    • Labels: None

    Description

      The bug can be triggered through Python:

      import pyarrow.parquet
      array = pyarrow.array.from_pylist([1] * 1000000)
      
      rb = pyarrow.RecordBatch.from_arrays([array], ['a'])
      rb2 = rb.slice(0,2)
      
      with open('/tmp/t.arrow', 'wb') as f:
        w = pyarrow.ipc.FileWriter(f, rb.schema)
        w.write_batch(rb2)
        w.close()
      

      This results in a large file, even though the slice holds only two rows:

      $ ll /tmp/t.arrow 
      -rw-rw-r-- 1 itai itai 800618 Apr 12 13:22 /tmp/t.arrow
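
      The symptom above can be illustrated with a toy model in plain Python. This is only a sketch under assumptions, not Arrow's actual implementation: it assumes a slice is a zero-copy (offset, length) window over the parent's buffer, and the names Int64Column, write_ignoring_slice and write_respecting_slice are hypothetical. It compares a writer that serializes the whole backing buffer with one that serializes only the sliced window.

```python
import struct
from array import array
from dataclasses import dataclass


@dataclass
class Int64Column:
    """Toy column: a shared buffer plus an (offset, length) window into it.

    Hypothetical layout for illustration only; not Arrow's real data structure.
    """
    values: array      # backing buffer, shared between a column and its slices
    offset: int = 0
    length: int = -1   # -1 means "to the end of the buffer"

    def __post_init__(self):
        if self.length < 0:
            self.length = len(self.values) - self.offset

    def slice(self, offset, length):
        # Zero-copy: reuse the parent's buffer and only narrow the window.
        return Int64Column(self.values, self.offset + offset, length)


def write_ignoring_slice(col):
    # Models the reported symptom: the whole backing buffer is serialized,
    # regardless of the slice's offset and length.
    return struct.pack("<q", col.length) + col.values.tobytes()


def write_respecting_slice(col):
    # Serializes only the elements inside the slice window.
    window = col.values[col.offset:col.offset + col.length]
    return struct.pack("<q", col.length) + window.tobytes()


full = Int64Column(array("q", [1] * 1000))
sliced = full.slice(0, 2)

naive_size = len(write_ignoring_slice(sliced))
fixed_size = len(write_respecting_slice(sliced))
print(naive_size, fixed_size)
```

      In this model the first writer's output grows with the parent array, while the second depends only on the slice length, which is the behaviour one would expect from the IPC writer.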
      

            People

              Assignee: Wes McKinney (wesm)
              Reporter: Itai Incze (itaiin)