Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
None
Description
Small reproducer:
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({'part': [3760212050]*10, 'col': range(10)})
pq.write_to_dataset(table, "test_int64_partition", partition_cols=['part'])

In [35]: pq.read_table("test_int64_partition/")
...
ArrowInvalid: error parsing '3760212050' as scalar of type int32
In ../src/arrow/scalar.cc, line 333, code: VisitTypeInline(*type_, this)
In ../src/arrow/dataset/partition.cc, line 218, code: (_error_or_value26).status()
In ../src/arrow/dataset/partition.cc, line 229, code: (_error_or_value27).status()
In ../src/arrow/dataset/discovery.cc, line 256, code: (_error_or_value17).status()

In [36]: pq.read_table("test_int64_partition/", use_legacy_dataset=True)
Out[36]:
pyarrow.Table
col: int64
part: dictionary<values=int64, indices=int32, ordered=0>
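The error message in the reproducer above suggests that the partition value was parsed as int32 even though 3760212050 does not fit in that range. The following is a minimal, pure-Python sketch of that range problem and of an inference rule that widens to int64 when needed. The helper name `infer_partition_type` and its fallback order are hypothetical and only illustrate the idea; this is not Arrow's actual inference code.

```python
# Illustrative sketch only: NOT the Arrow implementation.
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def infer_partition_type(raw_values):
    """Hypothetical helper: pick a type name for partition directory values.

    Tries int32 first, widens to int64 if any value overflows int32,
    and falls back to string for anything that is not an integer.
    """
    try:
        ints = [int(v) for v in raw_values]
    except ValueError:
        return "string"
    if all(INT32_MIN <= i <= INT32_MAX for i in ints):
        return "int32"
    if all(INT64_MIN <= i <= INT64_MAX for i in ints):
        return "int64"
    return "string"


value = 3760212050
print(value > INT32_MAX)                      # the value overflows int32
print(infer_partition_type(["3760212050"]))   # widened type for the reproducer's value
print(infer_partition_type(["42"]))           # small values still fit in int32
```

On an affected version, a possible workaround (an assumption, not something stated in this report) would be to avoid inference altogether by passing an explicit int64 schema for the partition field, for example through `pyarrow.dataset.partitioning` with `flavor="hive"`.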