Details
- Bug
- Status: Closed
- Major
- Resolution: Duplicate
- 2.0.0
- None
- Python 3.7.4, Mac OS
Description
I have not dug deeply into the Arrow codebase, but this appears to be caused by the new numpy release (1.20.0): https://github.com/numpy/numpy/releases
The issue below is not observed when using numpy 1.19.*
>>> pandas.__version__, pa.__version__, numpy.__version__
('1.2.1', '2.0.0', '1.20.0')
>>> df = pandas.DataFrame({'a': numpy.random.randn(10), 'b': numpy.random.randn(7).tolist() + [None, pandas.NA, numpy.nan], 'c': list(range(9)) + [numpy.nan]})
>>> pa.Table.from_pandas(df)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    pa.Table.from_pandas(df)
  File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas
  File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 588, in dataframe_to_arrays
    for c, f in zip(columns_to_convert, convert_fields)]
  File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 588, in <listcomp>
    for c, f in zip(columns_to_convert, convert_fields)]
  File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 574, in convert_column
    raise e
  File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 568, in convert_column
    result = pa.array(col, type=type_, from_pandas=True, safe=safe)
  File "pyarrow/array.pxi", line 292, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array
  File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type
  File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column a with type float64')
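For anyone triaging a similar failure, here is a minimal sketch of a check for the version combination reported above. The helper names `parse_version` and `matches_reported_combination` are hypothetical and not part of pyarrow or numpy. The sketch assumes that every numpy release from 1.20 onward behaves like 1.20.0 here, and it says nothing about pyarrow versions other than 2.0.0, since only that one was tested in this report.

```python
def parse_version(version):
    """Return the leading numeric components of a version string as a tuple of ints."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def matches_reported_combination(pyarrow_version, numpy_version):
    """True if the versions match the failing pair from this report.

    Assumption: any numpy >= 1.20 is treated like 1.20.0; only pyarrow 2.0.0
    is considered because that is the only version the report covers.
    """
    is_pyarrow_2_0_0 = parse_version(pyarrow_version)[:3] == (2, 0, 0)
    is_numpy_1_20_or_later = parse_version(numpy_version)[:2] >= (1, 20)
    return is_pyarrow_2_0_0 and is_numpy_1_20_or_later
```

In a live session the arguments would be `pa.__version__` and `numpy.__version__`; a `True` result suggests trying numpy 1.19.* as described above.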
Issue Links
- duplicates: ARROW-10833 [Python] Avoid usage of NumPy's PyArray_DescrCheck macro (Resolved)