Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 0.11.1
Description
PyArrow string arrays can no longer be created from NumPy string arrays with pyarrow>=0.8.0 (this includes pyarrow==0.11.1).
Please find below a quick repro:
import numpy as np
import pyarrow as pa

vec = np.array(["toto", "tata"])
pa.array(vec, pa.string())
Running this, I get the following:
---------------------------------------------------------------------------
ArrowInvalid                              Traceback (most recent call last)
<ipython-input-4-e753fb3a8193> in <module>()
----> 1 pa.array(vec, pa.string())

/usr/local/lib/python2.7/dist-packages/pyarrow/lib.so in pyarrow.lib.array()

/usr/local/lib/python2.7/dist-packages/pyarrow/lib.so in pyarrow.lib._ndarray_to_array()

/usr/local/lib/python2.7/dist-packages/pyarrow/lib.so in pyarrow.lib.check_status()

ArrowInvalid: 'utf32' codec can't decode bytes in position 0-3: code point not in range(0x110000)
However, this code snippet was working fine with pyarrow==0.7.1.
Was there a change in how pyarrow handles strings since 0.7.1?
Do you have any workaround for this?
Jacques
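
One possible reading of the error message, not confirmed anywhere in this report, is that the fixed-width byte strings in the NumPy array are being decoded as if they were UTF-32 data. The standard-library sketch below only illustrates why that would fail at "position 0-3": the four ASCII bytes of "toto", taken together as a single UTF-32 code unit, fall outside the valid Unicode range. The helper name `decode_as_utf32` is hypothetical, and the choice of little-endian byte order is an assumption made for the illustration.

```python
# Illustration only: the four ASCII bytes of "toto", as a byte string would store them.
raw = b"toto"


def decode_as_utf32(data):
    """Try to decode bytes as little-endian UTF-32; return (text, error)."""
    try:
        return data.decode("utf-32-le"), None
    except UnicodeDecodeError as exc:
        return None, exc


# Interpreting the 4 ASCII bytes as one UTF-32 code unit gives 0x6F746F74,
# which is above the largest valid code point (0x10FFFF), so decoding fails.
text, err = decode_as_utf32(raw)
print(text, err)

# The same text, when genuinely encoded as UTF-32, decodes without error.
ok, ok_err = decode_as_utf32(u"toto".encode("utf-32-le"))
print(ok, ok_err)
```

If this reading is right, the failure depends on how the input bytes are interpreted and not on the string contents themselves.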