Apache Arrow / ARROW-15215

[C++] Consolidate kernel data-copy utilities between replace_with_mask, case_when, coalesce, choose, fill_null_forward, fill_null_backward


    Description

      These six kernels rely on two separate, but otherwise very similar, sets of utilities for copying slices of an array into an output array. There is no reason they can't share a single set.

      The first set is here: "CopyFixedWidth" https://github.com/apache/arrow/blob/bd356295f6beaba744a2c6b498455701f53a64f8/cpp/src/arrow/compute/kernels/scalar_if_else.cc#L1282-L1284

      The second set is here: "ReplaceWithMask::CopyData" https://github.com/apache/arrow/blob/bd356295f6beaba744a2c6b498455701f53a64f8/cpp/src/arrow/compute/kernels/vector_replace.cc#L208-L209 (This is a little confusing because the utilities are intertwined with the kernel implementation.)

      They would need to be moved into a new header to share them between the codegen units. Also, their interfaces would need to be consolidated.

      Additionally, the utilities may be more verbose than necessary, or generate too much code for what they do. For instance, some of the utilities are instantiated as templates for every Arrow type. Instead, we could replace all instantiations for numbers, decimals, temporal types, and so on with a single one for FixedWidthType (an abstract base class). The benchmarks for these kernels should be run before and after the change to confirm there is no regression.
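To illustrate the last point, here is a minimal sketch of what a single, non-templated copy routine for byte-aligned fixed-width values could look like. It is not the existing Arrow code: the names CopyFixedWidthValues and DemoCopyInt32 are hypothetical, and the sketch assumes the caller passes the value width in bytes at runtime (for example, derived from the type's bit width) and has already allocated the output buffer. Bit-packed booleans and validity bitmaps are not covered and would need separate handling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not an Arrow API): copy `length` fixed-width values
// from `in`, starting at element `in_offset`, into `out`, starting at
// element `out_offset`. Because `byte_width` is a runtime argument, one
// compiled function serves every byte-aligned fixed-width type instead of
// one template instantiation per type.
inline void CopyFixedWidthValues(const uint8_t* in, int64_t in_offset,
                                 int64_t length, int64_t byte_width,
                                 uint8_t* out, int64_t out_offset) {
  assert(byte_width > 0 && length >= 0 && in_offset >= 0 && out_offset >= 0);
  std::memcpy(out + out_offset * byte_width, in + in_offset * byte_width,
              static_cast<std::size_t>(length * byte_width));
}

// Hypothetical usage: copy two int32 values (source elements 1 and 2)
// into positions 1 and 2 of a zero-initialized output of length 5.
inline std::vector<int32_t> DemoCopyInt32() {
  const std::vector<int32_t> src = {10, 20, 30, 40};
  std::vector<int32_t> dst(5, 0);
  CopyFixedWidthValues(reinterpret_cast<const uint8_t*>(src.data()),
                       /*in_offset=*/1, /*length=*/2,
                       /*byte_width=*/sizeof(int32_t),
                       reinterpret_cast<uint8_t*>(dst.data()),
                       /*out_offset=*/1);
  return dst;
}
```

The trade-off is that the compiler can no longer specialize the copy for a compile-time width, which is why the benchmarks would need to be checked.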


            People

              jabaribooker Jabari Booker
              lidavidm David Li

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h 20m