  1. Apache Arrow
  2. ARROW-9664

[Python] Array/ChunkedArray.to_pandas do not support types_mapper keyword

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 8.0.0
    • Component/s: Python
    • Environment:
      pyarrow: 1.0.0
      pandas: 1.0.5
      python: sys.version_info(major=3, minor=8, micro=2, releaselevel='final', serial=0)

    Description

      The to_pandas method of Arrow structures (Array, ChunkedArray, Table) accepts a types_mapper argument. The mapper is called for Table, but does not seem to be called for Array or ChunkedArray:

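      import pandas as pd
      import pyarrow
      
      data = pd.Series([0, None, 2], dtype=pd.Int32Dtype(), name='foo')
      
      def convert_types(arrow_type):
          # Raise so that any call to the mapper is immediately visible.
          raise ValueError("Function got called")
      
      
      # Table.to_pandas calls types_mapper, so the ValueError propagates:
      pyarrow.Table.from_pandas(data.to_frame()).to_pandas(types_mapper=convert_types)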
      
      Traceback (most recent call last):
        File "/home/adrien/.pyenv/versions/complete/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
          exec(code_obj, self.user_global_ns, self.user_ns)
        File "<ipython-input-6-d1e3e1f45f69>", line 1, in <module>
          pyarrow.Table.from_pandas(data.to_frame()).to_pandas(types_mapper=convert_types)
        File "pyarrow/array.pxi", line 715, in pyarrow.lib._PandasConvertible.to_pandas
        File "pyarrow/table.pxi", line 1565, in pyarrow.lib.Table._to_pandas
        File "/home/adrien/.pyenv/versions/complete/lib/python3.8/site-packages/pyarrow/pandas_compat.py", line 771, in table_to_blockmanager
          ext_columns_dtypes = _get_extension_dtypes(
        File "/home/adrien/.pyenv/versions/complete/lib/python3.8/site-packages/pyarrow/pandas_compat.py", line 840, in _get_extension_dtypes
          pandas_dtype = types_mapper(typ)
        File "<ipython-input-5-5a9760e8753f>", line 2, in convert_types
          raise ValueError("Function got called")
      ValueError: Function got called
      
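      # Array.to_pandas never calls types_mapper; no error is raised and the result falls back to float64:
      pyarrow.Int32Array.from_pandas(data).to_pandas(types_mapper=convert_types)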
      
      0    0.0
      1    NaN
      2    2.0
      dtype: float64

            People

              Assignee: Alenka Frim (alenka)
              Reporter: Adrien Hoarau (Adrien_)
              Votes: 1
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h 10m