ARROW-666

[Python] Error in DictionaryArray __repr__


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 0.3.0
    • Component/s: Python
    • Labels: None

    Description

      The script d.py is:

      import pandas as pd
      import pyarrow as pa
      
      vals = pd.Series(['green', 'red', 'blue'] * 4)
      d = pd.Series(['red', 'green', 'blue'])
      print(d)
      ind = vals.apply(pd.Index(d).get_loc)
      print(vals)
      print(ind)
      
      v = pa.DictionaryArray.from_arrays(ind, d)
      print(v)
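
The script passes two pieces to `DictionaryArray.from_arrays`: the dictionary `d` of distinct values and the integer positions `ind` that point into it. As a minimal pandas-only sketch (no pyarrow involved; the `decoded` variable is introduced here purely for illustration), the following checks that those indices decode back to the original values, which suggests the inputs themselves are well formed:

```python
import pandas as pd

# Same inputs as d.py
vals = pd.Series(['green', 'red', 'blue'] * 4)
d = pd.Series(['red', 'green', 'blue'])

# Position of each value within the dictionary
ind = vals.apply(pd.Index(d).get_loc)

# Decode: look each index up in the dictionary again
decoded = d.take(ind.to_numpy()).reset_index(drop=True)

# The round trip should reproduce the original values
print(decoded.equals(vals))
```

If the round trip holds, the failure is more likely in how the array is displayed than in how it is built.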
      

      Running d.py produces the following output and error:

      ╭─miki@xubi  python (git:ARROW-539) 
      ╰─<venv:python/venv>$ python d.py                             Mon Mar 20, 12:12 1 ↵
      0      red
      1    green
      2     blue
      dtype: object
      0     green
      1       red
      2      blue
      3     green
      4       red
      5      blue
      6     green
      7       red
      8      blue
      9     green
      10      red
      11     blue
      dtype: object
      0     1
      1     0
      2     2
      3     1
      4     0
      5     2
      6     1
      7     0
      8     2
      9     1
      10    0
      11    2
      dtype: int64
      Traceback (most recent call last):
        File "d.py", line 12, in <module>
          print(v)
        File "pyarrow/array.pyx", line 215, in pyarrow.array.Array.__repr__ (/home/miki/work/2sigma/arrow/python/build/temp.linux-x86_64-3.6/array.cxx:4132)
          values = array_format(self, window=10)
        File "/home/miki/work/2sigma/arrow/python/pyarrow/formatting.py", line 27, in array_format
          for x in arr:
        File "pyarrow/array.pyx", line 209, in __iter__ (/home/miki/work/2sigma/arrow/python/build/temp.linux-x86_64-3.6/array.cxx:3959)
          yield self.getitem(i)
        File "pyarrow/array.pyx", line 255, in pyarrow.array.Array.getitem (/home/miki/work/2sigma/arrow/python/build/temp.linux-x86_64-3.6/array.cxx:4825)
          return scalar.box_scalar(self.type, self.sp_array, i)
        File "pyarrow/scalar.pyx", line 246, in pyarrow.scalar.box_scalar (/home/miki/work/2sigma/arrow/python/build/temp.linux-x86_64-3.6/scalar.cxx:4522)
          val = _scalar_classes[type.type.type]()
      KeyError: 25
      ╭─miki@xubi  python (git:ARROW-539) 
      ╰─<venv:python/venv>$  
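
The last frame of the traceback shows `box_scalar` looking up a scalar class in `_scalar_classes` by type id, and the lookup fails for id 25, presumably the id of the dictionary type. The sketch below illustrates that failure mode in plain Python. It is not pyarrow's code: the table contents, the ids 9 and 13, and the `box_scalar_checked` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical dispatch table: type id -> class used to box a raw value.
# The ids 9 and 13 are made up; 25 is deliberately absent, as in the traceback.
_scalar_classes = {
    9: int,
    13: str,
}


def box_scalar(type_id, raw):
    # Mirrors the shape of the failing lookup: an unregistered id raises a bare KeyError.
    return _scalar_classes[type_id](raw)


def box_scalar_checked(type_id, raw):
    # Hypothetical variant that reports the unsupported type explicitly.
    try:
        cls = _scalar_classes[type_id]
    except KeyError:
        raise NotImplementedError(
            f"no scalar class registered for type id {type_id}"
        ) from None
    return cls(raw)


try:
    box_scalar(25, 0)
except KeyError as exc:
    print("KeyError:", exc)

try:
    box_scalar_checked(25, 0)
except NotImplementedError as exc:
    print("NotImplementedError:", exc)
```

Under these assumptions, a fix would either register a scalar class for the missing id or raise a clearer error, as the checked variant does.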
      


          People

            Assignee: Unassigned
            Reporter: Miki Tebeka (tebeka)