ARROW-8199: [C++] Add support for multi-column sort on Table

Details

    • Type: Wish
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 3.0.0
    • Component/s: C++

    Description

      I'm just coming up to speed with Arrow and have noticed a dearth of examples; maybe I can help here.

      I'd like to implement multi-column sorting for Tables and just want to ensure that I'm not duplicating existing work or proposing a bad design.

      My thought was to create a Table-specific version of SortToIndices() where you can specify the columns and sort order.

      Then I'd create Array "views" that use the indices to remap from the original Array values to the values in sorted order. (The original data would not be sorted in place, but that could be done as a second step.) I noticed that some of the array list variants keep offsets, but I didn't see anything that supports remapping through a list of indices. This may just be my oversight.
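
      A rough sketch of the idea, using plain std::vector columns rather than Arrow arrays. Everything named here (Column, Table, SortKey, this SortToIndices signature, and TakeByIndices) is a hypothetical stand-in for illustration and is not existing Arrow API. It only shows the two steps: compute sorted row indices from several keys, then remap a column through those indices.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Arrow types.
using Column = std::vector<int64_t>;
using Table = std::vector<Column>;  // column-major; all columns have equal length

enum class SortOrder { Ascending, Descending };

struct SortKey {
  std::size_t column;  // index of the column to sort by
  SortOrder order;     // direction for this column
};

// Returns the row indices that would put `table` in sorted order.
// Earlier keys take precedence; later keys only break ties.
// The table itself is left untouched.
std::vector<std::size_t> SortToIndices(const Table& table,
                                       const std::vector<SortKey>& keys) {
  const std::size_t num_rows = table.empty() ? 0 : table[0].size();
  std::vector<std::size_t> indices(num_rows);
  std::iota(indices.begin(), indices.end(), 0);  // 0, 1, 2, ...

  // stable_sort keeps the original row order for rows equal on every key.
  std::stable_sort(indices.begin(), indices.end(),
                   [&](std::size_t lhs, std::size_t rhs) {
                     for (const SortKey& key : keys) {
                       const int64_t a = table[key.column][lhs];
                       const int64_t b = table[key.column][rhs];
                       if (a == b) continue;  // tie: fall through to next key
                       return key.order == SortOrder::Ascending ? a < b : a > b;
                     }
                     return false;  // equal on all keys
                   });
  return indices;
}

// Remaps a column through `indices`, producing the values in sorted order.
// This copies; a real "view" would look values up lazily instead.
Column TakeByIndices(const Column& values,
                     const std::vector<std::size_t>& indices) {
  Column out;
  out.reserve(indices.size());
  for (std::size_t i : indices) out.push_back(values[i]);
  return out;
}
```

      For a two-column table with columns {2, 1, 2, 1} and {10, 20, 30, 40}, sorting by column 0 ascending and then column 1 descending gives the indices {3, 1, 2, 0}. Keeping the indices separate from the data means several columns can share one index list.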

      Thanks in advance, Scott

      Attachments

        1. DataFrame.h (75 kB, Scott Wilson)
        2. ArrowCsv.cpp (31 kB, Scott Wilson)

            People

              Assignee: Kouhei Sutou (kou)
              Reporter: Scott Wilson (swilson314)
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 5.5h