Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Using arrow commit 4fa7ac4 and parquet-cpp commit 0024665, reading the whole file works, but passing columns=['age'] to parquet.read_table raises a KeyError:
In [1]: from pyarrow import parquet

In [2]: t = parquet.read_table('/Users/christophercaycock/Desktop/sample.parquet')

In [3]: t.to_pandas()
Out[3]:
   age name
0    1    A
1    2    B
2    3    C

In [4]: t = parquet.read_table('/Users/christophercaycock/Desktop/sample.parquet', columns=['age'])
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-4-5cf213819489> in <module>()
----> 1 t = parquet.read_table('/Users/christophercaycock/Desktop/sample.parquet', columns=['age'])

/Users/christophercaycock/Desktop/arrow/python/pyarrow/parquet.pyx in pyarrow.parquet.read_table (/Users/christophercaycock/Desktop/arrow/python/build/temp.macosx-10.6-x86_64-3.5/parquet.cxx:2693)()
    143             return reader.read_all()
    144         else:
--> 145             column_idxs = [reader.column_name_idx(column) for column in columns]
    146             arrays = [reader.read_column(column_idx) for column_idx in column_idxs]
    147             return Table.from_arrays(columns, arrays)

/Users/christophercaycock/Desktop/arrow/python/pyarrow/parquet.pyx in pyarrow.parquet.ParquetReader.column_name_idx (/Users/christophercaycock/Desktop/arrow/python/build/temp.macosx-10.6-x86_64-3.5/parquet.cxx:2232)()
    102                 self.column_idx_map[str(metadata.schema().Column(i).path().get().ToDotString())] = i
    103
--> 104         return self.column_idx_map[column_name]
    105
    106     def read_column(self, int column_index):

KeyError: 'age'
This happens on both Mac and Linux.
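
One possible explanation, which this report does not confirm: the build path in the traceback indicates Python 3.5, and line 102 of parquet.pyx builds the map keys with str(...ToDotString()). If that C++ string reaches Python as bytes (an assumption about the Cython conversion), then str() on it yields the text "b'age'" rather than "age", and the lookup on line 104 fails. The sketch below reproduces only that pure-Python behavior. The helper names are hypothetical and not part of pyarrow.

```python
# Hypothetical stand-ins for the map built in ParquetReader.column_name_idx.
# Assumption: ToDotString() arrives in Python as bytes, not str.

def build_column_idx_map(raw_names):
    """Mimic line 102 of the traceback: keys are str(<bytes>)."""
    return {str(name): i for i, name in enumerate(raw_names)}


def build_column_idx_map_decoded(raw_names):
    """Variant that decodes the bytes to text before using them as keys."""
    return {name.decode('utf8'): i for i, name in enumerate(raw_names)}


raw_names = [b'age', b'name']  # column paths as bytes (assumed)

broken = build_column_idx_map(raw_names)
fixed = build_column_idx_map_decoded(raw_names)

print(sorted(broken))    # the keys carry the b'...' repr, not the bare name
print('age' in broken)   # the lookup that fails in the report
print(fixed['age'])      # decoded keys can be looked up by plain name
```

If this is indeed the cause, decoding the column path before inserting it into the map would make the lookup by plain column name succeed. It would also be consistent with the failure appearing on both Mac and Linux, since it would depend on the Python version rather than the platform.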