Apache Arrow / ARROW-15642

[Python] [JavaScript] Arrow IPC file output by apache-arrow tableToIPC method cannot be read by pyarrow


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 7.0.0
    • Fix Version/s: None
    • Component/s: JavaScript, Python
    • Labels: None

    Description

      IPC files created by the Node library `apache-arrow` do not appear to be readable by pyarrow. A reproduction of the issue is available here: https://github.com/dancoates/pyarrow-jsarrow-test

       

      Writing the Arrow file from JS:

      import {tableToIPC, tableFromArrays} from 'apache-arrow';
      import fs from 'fs';
      
      const LENGTH = 2000;
      const rainAmounts = Float32Array.from(
          { length: LENGTH },
          () => Number((Math.random() * 20).toFixed(1)));
      
      const rainDates = Array.from(
          { length: LENGTH },
          (_, i) => new Date(Date.now() - 1000 * 60 * 60 * 24 * i));
      
      const rainfall = tableFromArrays({
          precipitation: rainAmounts,
          date: rainDates
      });
      
      const outputTable = tableToIPC(rainfall);
      fs.writeFileSync('jsarrow.arrow', outputTable); 

       

      Reading in Python:

      import pyarrow as pa
      with open('jsarrow.arrow', 'rb') as f:
          with pa.ipc.open_file(f) as reader:
              df = reader.read_pandas()
              print(df.head())
      
       

       

      This produces the error:

      pyarrow.lib.ArrowInvalid: Not an Arrow file 
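
      One possible explanation, which this report does not confirm: Arrow defines two IPC encodings, a streaming format and a file format, and `pa.ipc.open_file` only accepts the file format. As far as I can tell, `tableToIPC` takes an optional second argument (`'stream'` or `'file'`) and defaults to `'stream'`, so the bytes written above may be a stream rather than a file. The sketch below is a rough way to check this from Python. The helper names `detect_ipc_format` and `read_arrow_table` are hypothetical, and the detection relies only on the framing bytes (the `ARROW1` magic for files, the `0xFFFFFFFF` continuation marker for streams), so it is a heuristic rather than a validator.

```python
import struct

ARROW_MAGIC = b"ARROW1"    # magic string that opens and closes an Arrow IPC *file*
CONTINUATION = 0xFFFFFFFF  # marker that opens each message in the IPC *stream* format


def detect_ipc_format(data: bytes) -> str:
    """Classify a buffer as 'file', 'stream', or 'unknown' from its framing bytes.

    Hypothetical helper: streams written in the legacy (pre-0.15) layout carry
    no continuation marker and are reported as 'unknown'.
    """
    if len(data) >= 12 and data[:6] == ARROW_MAGIC and data[-6:] == ARROW_MAGIC:
        return "file"
    if len(data) >= 8:
        (marker,) = struct.unpack_from("<I", data, 0)
        if marker == CONTINUATION:
            return "stream"
    return "unknown"


def read_arrow_table(path: str):
    """Open `path` with the pyarrow reader that matches the detected format."""
    import pyarrow as pa  # imported lazily so the detection helper has no dependencies

    with open(path, "rb") as f:
        data = f.read()

    kind = detect_ipc_format(data)
    if kind == "file":
        return pa.ipc.open_file(pa.BufferReader(data)).read_all()
    if kind == "stream":
        return pa.ipc.open_stream(pa.BufferReader(data)).read_all()
    raise ValueError(f"{path} does not look like Arrow IPC data")
```

      If this explanation holds, either passing `'file'` as the second argument to `tableToIPC` on the JS side or reading with `pa.ipc.open_stream` on the Python side should avoid the error; `read_arrow_table('jsarrow.arrow').to_pandas()` would then stand in for the `reader.read_pandas()` call above.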

       

       

People

    • Assignee: Unassigned
    • Reporter: Dan Coates (dancoates)
    • Votes: 0
    • Watchers: 6