Details
- Bug
- Status: Closed
- Major
- Resolution: Duplicate
- 0.14.1
- None
- Tested with 0.14.1 and 0.14.0.RAY from pip3 on Ubuntu
Description
When initializing a Table from a boolean pandas.DataFrame that is not in Fortran order, the contents of the resulting Table differ from the contents of the DataFrame.
Sample:
    import pandas as pd
    import pyarrow as pa
    import numpy as np

    mask = np.full((3,3), False)
    mask[:,1] = True
    df = pd.DataFrame(mask)

    print(df)
    print(pa.table(df).to_pandas())
The output:
           0     1      2
    0  False  True  False
    1  False  True  False
    2  False  True  False
           0      1      2
    0  False   True  False
    1  False  False  False
    2  False  False  False
I.e., column 1 is different before and after roundtripping through pa.Table.
If I add order='F' to the np.full invocation, the result is as expected. Also, the problem seems to disappear if I use dtype=int.
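To make the difference between the two layouts concrete, the sketch below builds the same mask in C order and in Fortran order using NumPy only. It shows that the logical contents are identical while a single column is strided in one layout and contiguous in the other. Whether this stride difference is what triggers the wrong conversion is an assumption; the report does not state the cause.

```python
import numpy as np

# The mask from the report, built once in each memory layout.
mask_c = np.full((3, 3), False)              # NumPy default: C (row-major) order
mask_c[:, 1] = True
mask_f = np.full((3, 3), False, order="F")   # Fortran (column-major) order
mask_f[:, 1] = True

# The logical contents are identical; only the memory layout differs.
assert np.array_equal(mask_c, mask_f)

# One column viewed from each array: strided in C order, contiguous in F order.
col_c = mask_c[:, 1]
col_f = mask_f[:, 1]
print(col_c.flags["C_CONTIGUOUS"], col_c.strides)  # False (3,)
print(col_f.flags["C_CONTIGUOUS"], col_f.strides)  # True (1,)
```

An existing C-ordered array can be converted with `np.asfortranarray(mask_c)`, which should be equivalent to passing `order='F'` at creation time.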
Issue Links
- is duplicated by ARROW-6325 [Python] wrong conversion of DataFrame with boolean values (Resolved)