Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version: 0.12.0
Description
On pyarrow 0.12.0, some (but not all) columns cannot be read as category dtype. Attached is an extracted failing sample.
import dask.dataframe as dd

df = dd.read_parquet('slug.pq', categories=['slug'], engine='pyarrow').compute()
print(len(df['slug'].dtype.categories))
This works on pyarrow 0.11.1 (and fastparquet 0.2.1).
Attachments
Issue Links
- is blocked by ARROW-4872 [Python] Keep backward compatibility for ParquetDatasetPiece (Resolved)