Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Duplicate
- Affects Version: 0.8.0
- Environment:
  - OS: Mac OS X 10.13.2
  - Python: 3.6.4
  - PyArrow: 0.8.0
Description
If you try to write a PyArrow table containing nanosecond-resolution timestamps to Parquet using `coerce_timestamps` and `use_deprecated_int96_timestamps=True`, the Arrow library will segfault.
The crash doesn't happen if you don't coerce the timestamp resolution or if you don't use 96-bit timestamps.
To Reproduce:

    import datetime

    import pyarrow
    from pyarrow import parquet

    schema = pyarrow.schema([
        pyarrow.field('last_updated', pyarrow.timestamp('ns')),
    ])
    data = [
        pyarrow.array([datetime.datetime.now()], pyarrow.timestamp('ns')),
    ]
    table = pyarrow.Table.from_arrays(data, ['last_updated'])
    with open('test_file.parquet', 'wb') as fdesc:
        parquet.write_table(table, fdesc,
                            coerce_timestamps='us',  # 'ms' works too
                            use_deprecated_int96_timestamps=True)
See attached file for the crash report.
Attachments
Issue Links
- Blocked by:
  - PARQUET-1274 [Python] SegFault in pyarrow.parquet.write_table with specific options (Resolved)