Uploaded image for project: 'Apache Arrow'
  1. Apache Arrow
  2. ARROW-10855

[Python][Numpy] ArrowTypeError after upgrading NumPy to 1.20.0rc1

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Duplicate
    • 2.0.0
    • 3.0.0
    • Python
    • None
    • macOS Big Sur 11.0.1

    Description

      After upgrading numpy to 1.20.0rc1 version, pandas .to_parquet() will raise ArrowTypeError. 

      NumPy 1.19.4, Python 3.7.9, macos: 

       

      Python 3.7.9 (default, Nov 20 2020, 23:58:42) 
      [Clang 12.0.0 (clang-1200.0.32.27)] on darwin
      Type "help", "copyright", "credits" or "license" for more information.
      >>> import numpy as np
      >>> import pandas as pd
      >>> np.__version__
      '1.19.4'
      >>> pd.DataFrame({'i': [1, 2, 3, np.nan]}, dtype='Int64').to_parquet('nullint.parquet')
      >>> 
      
      

      NumPy 1.20.0rc1, Python 3.7.9, macos: 

      Python 3.7.9 (default, Nov 20 2020, 23:58:42) 
      [Clang 12.0.0 (clang-1200.0.32.27)] on darwin
      Type "help", "copyright", "credits" or "license" for more information.
      >>> import numpy as np
      >>> import pandas as pd
      >>> np.__version__
      '1.20.0rc1'
      >>> pd.DataFrame({'i': [1, 2, 3, np.nan]}, dtype='Int64').to_parquet('nullint.parquet')
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/usr/local/lib/python3.7/site-packages/pandas/util/_decorators.py", line 199, in wrapper
          return func(*args, **kwargs)
        File "/usr/local/lib/python3.7/site-packages/pandas/core/frame.py", line 2372, in to_parquet
          **kwargs,
        File "/usr/local/lib/python3.7/site-packages/pandas/io/parquet.py", line 276, in to_parquet
          **kwargs,
        File "/usr/local/lib/python3.7/site-packages/pandas/io/parquet.py", line 101, in write
          table = self.api.Table.from_pandas(df, **from_pandas_kwargs)
        File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas
        File "/usr/local/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 588, in dataframe_to_arrays
          for c, f in zip(columns_to_convert, convert_fields)]
        File "/usr/local/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 588, in <listcomp>
          for c, f in zip(columns_to_convert, convert_fields)]
        File "/usr/local/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 574, in convert_column
          raise e
        File "/usr/local/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 568, in convert_column
          result = pa.array(col, type=type_, from_pandas=True, safe=safe)
        File "pyarrow/array.pxi", line 242, in pyarrow.lib.array
        File "pyarrow/array.pxi", line 110, in pyarrow.lib._handle_arrow_array_protocol
        File "/usr/local/lib/python3.7/site-packages/pandas/core/arrays/masked.py", line 227, in __arrow_array__
          return pa.array(self._data, mask=self._mask, type=type)
        File "pyarrow/array.pxi", line 292, in pyarrow.lib.array
        File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array
        File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type
        File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status
      pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column i with type Int64')
      >>> 
      

      Attachments

        Issue Links

          Activity

            People

              uwe Uwe Korn
              BarryJin Zhenghui Jin
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: