Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 7.0.0
Description
IPC files created by the Node library `apache-arrow` cannot be read by pyarrow: opening them with `pa.ipc.open_file` fails. A minimal reproduction is available here: https://github.com/dancoates/pyarrow-jsarrow-test
Writing the Arrow file from JS:
import {tableToIPC, tableFromArrays} from 'apache-arrow';
import fs from 'fs';

const LENGTH = 2000;

const rainAmounts = Float32Array.from(
    { length: LENGTH },
    () => Number((Math.random() * 20).toFixed(1)));

const rainDates = Array.from(
    { length: LENGTH },
    (_, i) => new Date(Date.now() - 1000 * 60 * 60 * 24 * i));

const rainfall = tableFromArrays({
    precipitation: rainAmounts,
    date: rainDates
});

const outputTable = tableToIPC(rainfall);
fs.writeFileSync('jsarrow.arrow', outputTable);
Reading in Python:
import pyarrow as pa

with open('jsarrow.arrow', 'rb') as f:
    with pa.ipc.open_file(f) as reader:
        df = reader.read_pandas()
        print(df.head())
produces the error:
pyarrow.lib.ArrowInvalid: Not an Arrow file
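
A possible explanation, not confirmed anywhere in this report: `tableToIPC` may be emitting the Arrow IPC streaming format by default, while `pa.ipc.open_file` expects the IPC file format, which begins and ends with the magic bytes `ARROW1`. The sketch below checks for those bytes and picks the matching pyarrow reader. The helper names `detect_ipc_format` and `read_arrow_table` are hypothetical and exist only for this illustration.

```python
ARROW_MAGIC = b"ARROW1"  # magic bytes that open and close an Arrow IPC *file*


def detect_ipc_format(data: bytes) -> str:
    """Return 'file' if the bytes carry the IPC file magic at both ends, else 'stream'.

    Hypothetical helper: it only inspects the magic bytes and does not
    validate the rest of the payload.
    """
    has_room = len(data) >= 2 * len(ARROW_MAGIC)
    if has_room and data.startswith(ARROW_MAGIC) and data.endswith(ARROW_MAGIC):
        return "file"
    return "stream"


def read_arrow_table(path: str):
    """Read an Arrow table from `path` using whichever reader matches its format."""
    # Imported here so the format check above can be used without pyarrow installed.
    import pyarrow as pa

    with open(path, "rb") as f:
        data = f.read()

    buf = pa.py_buffer(data)
    if detect_ipc_format(data) == "file":
        return pa.ipc.open_file(buf).read_all()
    # Assumption: anything without the file magic is treated as a stream.
    return pa.ipc.open_stream(buf).read_all()
```

If the assumption holds, `read_arrow_table('jsarrow.arrow').to_pandas().head()` should succeed where `pa.ipc.open_file` raised. Writing the file from JS with `tableToIPC(rainfall, 'file')` may be the other way around the error, assuming the second argument selects the output format.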