Details
- Bug
- Status: Open
- Minor
- Resolution: Unresolved
- 0.12.1
- None
Description
The index of a DataFrame is no longer reconstructed when reading with an empty column selection (columns=[]). This is a regression relative to 0.12.1 and probably only affects pd.RangeIndex.
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq
from kartothek.serialization import ParquetSerializer
from storefact import get_store_from_url

print(pa.__version__)

df = pd.DataFrame(
    {"a": [1, 2]}
)
print(df.index)

table = pa.Table.from_pandas(df)
buf = pa.BufferOutputStream()
pq.write_table(table, buf)

reader = pa.BufferReader(buf.getvalue().to_pybytes())
table_restored = pq.read_pandas(reader, columns=[])
df_restored = table_restored.to_pandas()
print(len(df_restored))
Issue Links
- is related to: ARROW-5427 [Python] RangeIndex serialization change implications (Resolved)