Details
Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 9.0.0
Fix Version/s: None
Environment: OSX 12.6, M1 silicon
Description
A pyarrow Table with a column that contains only null values cannot be converted to a pandas DataFrame with that column as a categorical (that is, by passing the column name in the categories argument of to_pandas). However, pandas does support "empty" categoricals. A simple workaround is therefore to convert the pa.Table without categories, so that the column is loaded as object dtype, and then cast it to a categorical in pandas, which yields an empty categorical. That workaround does not fix the pyarrow bug at its root.
Sample reproducible example
import pyarrow as pa

pylist = [{'x': None, '__index_level_0__': 2}, {'x': None, '__index_level_0__': 3}]
tbl = pa.Table.from_pylist(pylist)

# Errors
df_broken = tbl.to_pandas(categories=["x"])

# Works
df_works = tbl.to_pandas()
df_works = df_works.astype({"x": "category"})
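The statement that pandas supports "empty" categoricals can be checked without pyarrow. The following is a minimal sketch, assuming only that pandas is installed; the DataFrame is built by hand to mimic what a plain conversion of the table above would produce (an all-null object column with index 2 and 3), and the variable names are illustrative.

```python
import pandas as pd

# An object column holding only nulls, built by hand here to stand in
# for the result of a plain to_pandas() call on the table above.
df = pd.DataFrame({"x": [None, None]}, index=[2, 3])

# Casting to "category" succeeds and yields a categorical with no categories.
df = df.astype({"x": "category"})

n_categories = len(df["x"].cat.categories)  # number of distinct categories
all_missing = bool(df["x"].isna().all())    # every value is still missing
codes = df["x"].cat.codes.tolist()          # missing values are coded as -1

print(df["x"].dtype, n_categories, all_missing, codes)
```

If this behaves as expected, the cast itself is not the problem, which is consistent with the report that the failure sits in the pyarrow conversion path rather than in pandas.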