Apache Arrow / ARROW-14177

[C++] Optimize dictionary support in kernels/Support nulls in DictionaryUnifier

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: C++

    Description

      ARROW-13358, ARROW-14167, ARROW-13573, etc. add support for dictionary types in various generic 'selection'-style kernels. However, that support is fairly unoptimized: it repeatedly hashes and inserts the same values. In some cases, we could implement a more optimized path instead.

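      For example, for if_else, we could first unify both dictionaries in such a way that the first dictionary is unchanged (the unified dictionary starts with the first dictionary's values, and any new values from the second are appended). We would then copy-with-transpose the indices of the second argument, and finally copy the relevant indices of the first argument, which needs no transposition because those indices are still valid in the unified dictionary. This would require DictionaryUnifier to support nulls in dictionaries, however.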
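
The steps above can be sketched roughly as follows. This is a minimal illustration using only the C++ standard library, not Arrow's actual types or kernels: `Dict`, `DictArray`, `Unified`, `UnifyKeepingLeft`, and `IfElse` are hypothetical names invented for this sketch, values are assumed to be strings, and a null dictionary entry is modeled as `std::nullopt`. Null conditions are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins, NOT Arrow types. A dictionary entry may be null.
using Dict = std::vector<std::optional<std::string>>;

struct DictArray {
  Dict dictionary;
  std::vector<std::optional<int32_t>> indices;  // nullopt = null slot
};

struct Unified {
  Dict dictionary;                       // left dictionary + new right values
  std::vector<int32_t> right_transpose;  // right index -> unified index
};

// Unify so that the left dictionary is an unchanged prefix of the result.
Unified UnifyKeepingLeft(const Dict& left, const Dict& right) {
  Unified out;
  out.dictionary = left;
  std::unordered_map<std::string, int32_t> lookup;
  std::optional<int32_t> null_index;  // position of the null entry, if any
  for (std::size_t i = 0; i < left.size(); ++i) {
    const auto idx = static_cast<int32_t>(i);
    if (left[i]) {
      lookup.emplace(*left[i], idx);  // keeps the first occurrence
    } else if (!null_index) {
      null_index = idx;
    }
  }
  out.right_transpose.reserve(right.size());
  for (const auto& value : right) {
    const auto next = static_cast<int32_t>(out.dictionary.size());
    if (!value) {
      if (!null_index) {  // first null seen anywhere: append it
        null_index = next;
        out.dictionary.push_back(std::nullopt);
      }
      out.right_transpose.push_back(*null_index);
    } else {
      auto [it, inserted] = lookup.emplace(*value, next);
      if (inserted) out.dictionary.push_back(value);
      out.right_transpose.push_back(it->second);
    }
  }
  return out;
}

DictArray IfElse(const std::vector<bool>& cond, const DictArray& left,
                 const DictArray& right) {
  assert(cond.size() == left.indices.size());
  assert(cond.size() == right.indices.size());
  Unified u = UnifyKeepingLeft(left.dictionary, right.dictionary);
  DictArray out;
  out.dictionary = std::move(u.dictionary);
  out.indices.resize(cond.size());
  // Copy-with-transpose the indices of the second argument.
  for (std::size_t i = 0; i < cond.size(); ++i) {
    if (right.indices[i]) {
      out.indices[i] =
          u.right_transpose[static_cast<std::size_t>(*right.indices[i])];
    }
  }
  // Copy the relevant indices of the first argument; no transposition needed.
  for (std::size_t i = 0; i < cond.size(); ++i) {
    if (cond[i]) out.indices[i] = left.indices[i];
  }
  return out;
}
```

Each dictionary value is hashed once during unification, rather than once per selected element, which is the saving the description is after.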

            People

              Assignee: Unassigned
              Reporter: David Li (lidavidm)
