Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
Description
In _table_to_blocks (pandas_compat.py), the extension_columns argument is equal to {None: interval[int64, right]} for a DataFrame whose index is built with pd.interval_range. The key None cannot be encoded to bytes, so an error is raised. The same happens for pd.PeriodIndex.
Example:
import pandas as pd
import pyarrow as pa

df = pd.DataFrame(index=pd.interval_range(start=0, end=5))
table = pa.table(df)
table.to_pandas()
Error (this traceback is from the equivalent pd.PeriodIndex case):
TypeError                                 Traceback (most recent call last)
/var/folders/gw/q7wqd4tx18n_9t4kbkd0bj1m0000gn/T/ipykernel_13963/1439451337.py in <module>
      1 df5 = pd.DataFrame(index=pd.PeriodIndex(year=[2000, 2002], quarter=[1, 3]))
      2 table5 = pa.table(df5)
----> 3 table5.to_pandas().shape

~/repos/arrow/python/pyarrow/array.pxi in pyarrow.lib._PandasConvertible.to_pandas()
    764                 self_destruct=self_destruct
    765             )
--> 766         return self._to_pandas(options, categories=categories,
    767                                ignore_metadata=ignore_metadata,
    768                                types_mapper=types_mapper)

~/repos/arrow/python/pyarrow/table.pxi in pyarrow.lib.Table._to_pandas()
   1819                    types_mapper=None):
   1820         from pyarrow.pandas_compat import table_to_blockmanager
-> 1821         mgr = table_to_blockmanager(
   1822             options, self, categories,
   1823             ignore_metadata=ignore_metadata,

~/repos/arrow/python/pyarrow/pandas_compat.py in table_to_blockmanager(options, table, categories, ignore_metadata, types_mapper)
    787     _check_data_column_metadata_consistency(all_columns)
    788     columns = _deserialize_column_index(table, all_columns, column_indexes)
--> 789     blocks = _table_to_blocks(options, table, categories, ext_columns_dtypes)
    790
    791     axes = [columns, index]

~/repos/arrow/python/pyarrow/pandas_compat.py in _table_to_blocks(options, block_table, categories, extension_columns)
   1133     # Convert an arrow table to Block from the internal pandas API
   1134     columns = block_table.column_names
-> 1135     result = pa.lib.table_to_blocks(options, block_table, categories,
   1136                                     list(extension_columns.keys()))
   1137     return [_reconstruct_block(item, columns, extension_columns)

~/repos/arrow/python/pyarrow/table.pxi in pyarrow.lib.table_to_blocks()
   1215     c_options.categorical_columns = {tobytes(cat) for cat in categories}
   1216     if extension_columns is not None:
-> 1217         c_options.extension_columns = {tobytes(col)
   1218                                        for col in extension_columns}
   1219

~/repos/arrow/python/pyarrow/lib.cpython-39-darwin.so in set.from_py.__pyx_convert_unordered_set_from_py_std_3a__3a_string()

~/repos/arrow/python/pyarrow/lib.cpython-39-darwin.so in string.from_py.__pyx_convert_string_from_py_std__in_string()

TypeError: expected bytes, NoneType found
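The failure mode in the traceback can be illustrated in isolation, without pandas or pyarrow installed. The sketch below is pure Python and only mimics the key-encoding step: to_bytes and encode_extension_columns are hypothetical stand-ins, not pyarrow functions, and the skip_unnamed guard is just one possible way to avoid the error, not necessarily the fix that was applied to resolve this issue.

```python
def to_bytes(name):
    """Hypothetical stand-in for the string-to-bytes step on column names."""
    if isinstance(name, bytes):
        return name
    if isinstance(name, str):
        return name.encode("utf8")
    # Mirrors the message seen at the bottom of the traceback.
    raise TypeError(f"expected bytes, {type(name).__name__} found")


def encode_extension_columns(extension_columns, skip_unnamed=False):
    """Encode the keys of a {column name: dtype} mapping as a set of bytes.

    skip_unnamed is an illustrative guard (an assumption, not pyarrow API):
    when True, keys that are None are dropped before encoding.
    """
    names = list(extension_columns.keys())
    if skip_unnamed:
        names = [n for n in names if n is not None]
    return {to_bytes(n) for n in names}


# Mapping shaped like the one reported for the interval index case.
ext = {None: "interval[int64, right]"}

try:
    encode_extension_columns(ext)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With the guard, the None key is ignored and nothing is left to encode.
guarded = encode_extension_columns(ext, skip_unnamed=True)
```

Here error_message holds the same text as the reported TypeError, while guarded is an empty set because the only key in the mapping was None.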