Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 0.16.0
- None
- Environment: Windows 7, python 3.6.9, pyarrow 0.16 from conda-forge
Description
After I installed pyarrow 0.16, some Parquet files created with pyarrow 0.15.1 made Python crash. I drilled down to the simplest example I could find.
It turns out that some Parquet files created with pyarrow 0.16 cannot be read back either. The example below works fine with arrays_ok, but Python crashes with arrays_nok (apparently as soon as the array holds at least three distinct values).
Besides, it works fine with 'none', 'gzip' and 'brotli' compression. The problem seems to happen only with snappy.
import pyarrow.parquet as pq
import pyarrow as pa

arrays_ok = [[0, 1]]
arrays_ok = [[0, 1, 1]]
arrays_nok = [[0, 1, 2]]

table = pa.Table.from_arrays(arrays_nok, names=['a'])
pq.write_table(table, 'foo.parquet', compression='snappy')
pq.read_table('foo.parquet')
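Because the failure is a hard interpreter crash rather than a Python exception, comparing codecs in a single session stops at the first bad one. A minimal sketch of one way around that, assuming each round trip is run in a child interpreter and judged by its exit code; the helper names `run_isolated` and `check_codecs` are hypothetical and not part of pyarrow:

```python
import subprocess
import sys
import textwrap

# Code executed in a child interpreter; {codec!r} is filled in per run.
# It repeats the round trip from the report with the failing array [0, 1, 2].
SNIPPET = textwrap.dedent("""
    import os
    import tempfile
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.Table.from_arrays([pa.array([0, 1, 2])], names=['a'])
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'foo.parquet')
        pq.write_table(table, path, compression={codec!r})
        pq.read_table(path)
""")


def run_isolated(source):
    """Run `source` in a fresh interpreter and return its exit code."""
    return subprocess.run([sys.executable, "-c", source]).returncode


def check_codecs(codecs=("none", "gzip", "brotli", "snappy")):
    """Map each codec to the child's exit code (0 means the round trip succeeded)."""
    return {codec: run_isolated(SNIPPET.format(codec=codec)) for codec in codecs}


if __name__ == "__main__":
    for codec, code in check_codecs().items():
        print(codec, "ok" if code == 0 else f"failed (exit code {code})")
```

A non-zero exit code only says that the child did not finish cleanly; it does not distinguish a crash from an ordinary exception such as a missing pyarrow installation, so the child's stderr is still worth reading.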
Issue Links
- is duplicated by
  ARROW-9229 [Python] Pyarrow.Parquet.read_table Silently Crashes Python (Closed)
- is related to
  ARROW-9114 [C++][Packaging] Illegal instruction crash in arrow.dll (Closed)
- relates to
  ARROW-9229 [Python] Pyarrow.Parquet.read_table Silently Crashes Python (Closed)