Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Not A Problem
- Affects Version/s: 0.15.0, 0.15.1
- Fix Version/s: None
- Component/s: None
- Environment:
  - Broken PyArrow: conda-forge build 0.15.0-py37h8b68381_0
  - Working PyArrow: conda-forge build 0.14.1-py37h8b68381_2
  - Arrow JS versions tested: 0.13.0, 0.14.0 (apache-arrow@latest)
  - OS: Tested on Win7, Amazon Linux AMI 2018.03
  - Python platforms: Win7/Conda/Python 3.6, RHEL/Conda/Python 3.6
  - JS platforms: Node 10.15.0, Chrome 77.0.3865.120
Description
Originally raised by Sarath on StackOverflow; reporting it here as I've run into this issue as well.
When a table exported with PyArrow 0.15 is read by ArrowJS, it is incorrectly imported as a 0-row table and all of the table's data is skipped. The schema, including metadata, is imported correctly, but the length of the table is 0.
import pyarrow as pa

table = pa.Table.from_pydict({"a": [1, 2, 3], "b": [4, 5, 6]})
with pa.RecordBatchFileWriter('file.arrow', table.schema) as writer:
    writer.write_table(table)
If file.arrow was generated with PyArrow 0.15, the following JS snippet fails both assertions. If it was generated with PyArrow 0.14, the snippet works as expected:
const { readFileSync } = require("fs");
const { Table } = require("apache-arrow");

const data = readFileSync("file.arrow");
const table = Table.from([data]);

console.assert(table.length === 3, "Table should have 3 rows");
console.assert(table.get(0) != null, "First row should not be null");
Tested with PyArrow 0.14.1, 0.15.0, and ArrowJS 0.13.0 and 0.14.1.