Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Duplicate
Description
d.py contains:
import pandas as pd
import pyarrow as pa

vals = pd.Series(['green', 'red', 'blue'] * 4)
d = pd.Series(['red', 'green', 'blue'])
print(d)
ind = vals.apply(pd.Index(d).get_loc)
print(vals)
print(ind)

v = pa.DictionaryArray.from_arrays(ind, d)
print(v)
Running d.py prints the three series and then fails with a KeyError when printing the dictionary array:
╭─miki@xubi python (git:ARROW-539)
╰─<venv:python/venv>$ python d.py
0      red
1    green
2     blue
dtype: object
0     green
1       red
2      blue
3     green
4       red
5      blue
6     green
7       red
8      blue
9     green
10      red
11     blue
dtype: object
0     1
1     0
2     2
3     1
4     0
5     2
6     1
7     0
8     2
9     1
10    0
11    2
dtype: int64
Traceback (most recent call last):
  File "d.py", line 12, in <module>
    print(v)
  File "pyarrow/array.pyx", line 215, in pyarrow.array.Array.__repr__ (/home/miki/work/2sigma/arrow/python/build/temp.linux-x86_64-3.6/array.cxx:4132)
    values = array_format(self, window=10)
  File "/home/miki/work/2sigma/arrow/python/pyarrow/formatting.py", line 27, in array_format
    for x in arr:
  File "pyarrow/array.pyx", line 209, in __iter__ (/home/miki/work/2sigma/arrow/python/build/temp.linux-x86_64-3.6/array.cxx:3959)
    yield self.getitem(i)
  File "pyarrow/array.pyx", line 255, in pyarrow.array.Array.getitem (/home/miki/work/2sigma/arrow/python/build/temp.linux-x86_64-3.6/array.cxx:4825)
    return scalar.box_scalar(self.type, self.sp_array, i)
  File "pyarrow/scalar.pyx", line 246, in pyarrow.scalar.box_scalar (/home/miki/work/2sigma/arrow/python/build/temp.linux-x86_64-3.6/scalar.cxx:4522)
    val = _scalar_classes[type.type.type]()
KeyError: 25
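For readers unfamiliar with dictionary encoding, the following is a minimal pure-Python sketch of what the script asks for: each value is replaced by its position in a dictionary of unique values, and decoding looks each position up again. It uses neither pandas nor pyarrow, and `encode` and `decode` are hypothetical helper names chosen for illustration, not part of either library. It only shows that the indices and dictionary built in d.py are consistent. It says nothing about how pyarrow stores or prints them.

```python
# Pure-Python sketch of dictionary encoding (no pandas or pyarrow needed).
# encode() and decode() are hypothetical helpers, not library functions.

def encode(values, dictionary):
    """Return, for each value, its position in `dictionary`."""
    positions = {item: i for i, item in enumerate(dictionary)}
    return [positions[value] for value in values]


def decode(indices, dictionary):
    """Map each index back to the dictionary entry it refers to."""
    return [dictionary[i] for i in indices]


# Same data as in d.py.
dictionary = ['red', 'green', 'blue']
values = ['green', 'red', 'blue'] * 4

indices = encode(values, dictionary)
roundtrip = decode(indices, dictionary)

print(indices)
print(roundtrip == values)
```

The indices computed here are the same as the int64 series printed by d.py, so the input passed to `DictionaryArray.from_arrays` looks well formed. According to the traceback, the failure happens later, while `print(v)` iterates over the array.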