Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version: 0.9.0
Description
When writing a table to Parquet through pandas, the process crashes with a segmentation fault if any column includes an empty list.
Minimal example:
import pyarrow as pa
import pyarrow.parquet as pq
import pandas as pd

def save(rows):
    table1 = pa.Table.from_pandas(pd.DataFrame(rows))
    pq.write_table(table1, 'test-foo.pq')
    table2 = pq.read_table('test-foo.pq')
    print('ROWS:', rows)
    print('TABLE1:', table1.to_pandas(), sep='\n')
    print('TABLE2:', table2.to_pandas(), sep='\n')

save([{'val': ['something']}])
print('---')
save([{'val': []}])  # empty
Output:
ROWS: [{'val': ['something']}]
TABLE1:
           val
0  [something]
TABLE2:
           val
0  [something]
---
ROWS: [{'val': []}]
TABLE1:
  val
0  []
[1]    13472 segmentation fault (core dumped)  python3 test.py
Versions:
$ pip3 list | grep pyarrow
pyarrow (0.9.0)
$ python3 --version
Python 3.5.2
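Because the crash shown in the output above kills the interpreter outright, it cannot be caught with try/except in the same process. One way to check whether a given pyarrow build is affected is to run the empty-list case in a child interpreter and inspect how it exited. The following is only a sketch under stated assumptions: `run_isolated` and `REPRO` are hypothetical names introduced here, not part of pyarrow, and the signal check assumes a POSIX system.

```python
import signal
import subprocess
import sys

# The empty-list case from the report, as a standalone script.
# It writes, reads back, and converts the table, as the original example does.
REPRO = """
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

table1 = pa.Table.from_pandas(pd.DataFrame([{'val': []}]))
pq.write_table(table1, 'test-foo.pq')
table2 = pq.read_table('test-foo.pq')
table2.to_pandas()
"""


def run_isolated(code):
    """Run `code` in a child interpreter and describe how it exited.

    Hypothetical helper, not part of pyarrow. Assumes a POSIX system, where
    subprocess reports termination by a signal as a negative return code.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if proc.returncode < 0:
        # The child was terminated by a signal, e.g. SIGSEGV on a segfault.
        return "killed by {}".format(signal.Signals(-proc.returncode).name)
    return "exited with code {}".format(proc.returncode)
```

Calling `run_isolated(REPRO)` on an affected build would be expected to report that the child was killed by SIGSEGV, while a build containing the fix should exit with code 0. A non-zero exit code without a signal usually means something else went wrong, such as pyarrow not being installed.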
Issue Links
- is caused by PARQUET-1268 [C++] Conversion of Arrow null list columns fails (Resolved)