Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
An annoying regression (originally reported at https://github.com/pandas-dev/pandas/issues/33878): when specifying columns to read, the read now fails if the columns are not listed in exactly the same order as in the file:
In [27]: table = pa.table([[1, 2, 3], [4, 5, 6], [7, 8, 9]], names=['a', 'b', 'c'])

In [29]: from pyarrow import feather

In [30]: feather.write_feather(table, "test.feather")

# this works fine
In [32]: feather.read_table("test.feather", columns=['a', 'b'])
Out[32]:
pyarrow.Table
a: int64
b: int64

In [33]: feather.read_table("test.feather", columns=['b', 'a'])
---------------------------------------------------------------------------
ArrowInvalid                              Traceback (most recent call last)
<ipython-input-33-e01caeabb389> in <module>
----> 1 feather.read_table("test.feather", columns=['b', 'a'])

~/scipy/repos/arrow/python/pyarrow/feather.py in read_table(source, columns, memory_map)
    237         return reader.read_indices(columns)
    238     elif all(map(lambda t: t == str, column_types)):
--> 239         return reader.read_names(columns)
    240
    241     column_type_names = [t.__name__ for t in column_types]

~/scipy/repos/arrow/python/pyarrow/feather.pxi in pyarrow.lib.FeatherReader.read_names()

~/scipy/repos/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowInvalid: Schema at index 0 was different:
b: int64
a: int64
vs
a: int64
b: int64
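On an affected version, one possible workaround is to request the columns in file order and reorder them after reading. The following is only a sketch under stated assumptions, not the fix that was applied for this issue: `in_file_order` and `read_columns_any_order` are hypothetical helper names, the caller is assumed to already know the column order of the file, and `Table.select` is assumed to be available in the installed pyarrow version.

```python
def in_file_order(requested, file_columns):
    """Return the requested column names sorted into the order used in the file."""
    position = {name: i for i, name in enumerate(file_columns)}
    missing = [name for name in requested if name not in position]
    if missing:
        raise KeyError(f"columns not in file: {missing}")
    return sorted(requested, key=position.__getitem__)


def read_columns_any_order(path, columns, file_columns):
    """Hypothetical workaround: read in file order, then reorder in memory.

    `file_columns` is the column order of the Feather file, supplied by the
    caller (an assumption of this sketch).
    """
    # Imported here so that in_file_order has no pyarrow dependency.
    from pyarrow import feather

    # Reading in file order avoids the schema mismatch shown in the traceback.
    table = feather.read_table(path, columns=in_file_order(columns, file_columns))
    # Assumes Table.select exists in the installed pyarrow version.
    return table.select(columns)
```

With the file from the report, `read_columns_any_order("test.feather", ['b', 'a'], ['a', 'b', 'c'])` would read `['a', 'b']` from disk and then return the columns reordered as `b`, `a`.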