Apache Arrow / ARROW-15545

[C++] Cast dictionary of extension type to extension type


    Description

      We support casting a DictionaryArray to the type of its dictionary values, which decodes it into a plain array of that type. For example:

      >>> arr = pa.array([1, 2, 1]).dictionary_encode()
      >>> arr
      <pyarrow.lib.DictionaryArray object at 0x7f0c1aca46d0>
      
      -- dictionary:
        [
          1,
          2
        ]
      -- indices:
        [
          0,
          1,
          0
        ]
      
      >>> arr.type
      DictionaryType(dictionary<values=int64, indices=int32, ordered=0>)
      >>> arr.cast(arr.type.value_type)
      <pyarrow.lib.Int64Array object at 0x7f0c19891dc0>
      [
        1,
        2,
        1
      ]
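
As a rough illustration of what this cast does, the sketch below models dictionary decoding in plain Python: each index is replaced by the dictionary value it points to, and a null index stays null. The function name `decode_dictionary` is hypothetical and is not part of pyarrow; this is a conceptual model, not how the Arrow kernel is implemented.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def decode_dictionary(
    indices: Sequence[Optional[int]], dictionary: Sequence[T]
) -> List[Optional[T]]:
    """Replace each index with the dictionary value it refers to.

    Hypothetical helper for illustration only; a None index models a null slot.
    """
    return [None if i is None else dictionary[i] for i in indices]


# Same data as the example above: dictionary [1, 2], indices [0, 1, 0].
decoded = decode_dictionary([0, 1, 0], [1, 2])
print(decoded)  # [1, 2, 1]
```

The result has one element per index, with the same element type as the dictionary values.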
      

      However, if the type of the dictionary values is an ExtensionType, this cast is not supported:

      >>> from pyarrow.tests.test_extension_type import UuidType
      >>> storage = pa.array([b"0123456789abcdef"], type=pa.binary(16))
      >>> arr = pa.ExtensionArray.from_storage(UuidType(), storage)
      >>> arr
      <pyarrow.lib.ExtensionArray object at 0x7f0c1875bc40>
      [
        30313233343536373839616263646566
      ]
      >>> dict_arr = pa.DictionaryArray.from_arrays(pa.array([0, 0], pa.int32()), arr)
      >>> dict_arr.type
      DictionaryType(dictionary<values=extension<arrow.py_extension_type<UuidType>>, indices=int32, ordered=0>)
      >>> dict_arr.cast(UuidType())
      ...
      ArrowNotImplementedError: Unsupported cast from dictionary<values=extension<arrow.py_extension_type<UuidType>>, indices=int32, ordered=0> to extension<arrow.py_extension_type<UuidType>> (no available cast function for target type)
      ../src/arrow/compute/cast.cc:119  GetCastFunctionInternal(cast_options->to_type, args[0].type().get())
      
      

            People

              milesgranger Miles Granger
              jorisvandenbossche Joris Van den Bossche

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 11h 20m