  1. Apache Arrow
  2. ARROW-14202

[C++] A more RAM-efficient top-k sink node

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 7.0.0
    • None
    • Component/s: C++

    Description

      Mentioned here:

      https://github.com/apache/arrow/pull/11274#pullrequestreview-768267959

      For example, a top-k implementation could periodically (whenever batches_ reaches some configurable number of rows) run through the buffered data and discard rows that can no longer be part of the result. As currently written, it would still require me to buffer the entire dataset in memory (and/or spill over).
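
      A minimal sketch of the periodic-discard idea, to make the proposal concrete. It does not use Arrow's sink node API: TopKBuffer, Consume, Finish, and prune_threshold are hypothetical names, and plain int values stand in for rows. Once more than prune_threshold rows are buffered, everything outside the current top k is dropped, so memory stays bounded by roughly prune_threshold plus one batch instead of the whole dataset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not part of Arrow): keeps the k largest values seen
// so far and prunes its buffer whenever it grows past a configurable size.
class TopKBuffer {
 public:
  TopKBuffer(std::size_t k, std::size_t prune_threshold)
      : k_(k), prune_threshold_(std::max(prune_threshold, k)) {
    assert(k > 0);
  }

  // Append one batch, then prune if more than prune_threshold_ rows are held.
  void Consume(const std::vector<int>& batch) {
    rows_.insert(rows_.end(), batch.begin(), batch.end());
    if (rows_.size() > prune_threshold_) Prune();
  }

  // Return the k largest values in descending order.
  std::vector<int> Finish() {
    Prune();
    std::sort(rows_.begin(), rows_.end(), std::greater<int>());
    return rows_;
  }

  std::size_t buffered_rows() const { return rows_.size(); }

 private:
  // Keep only the k largest values; O(n) on average via std::nth_element.
  void Prune() {
    if (rows_.size() <= k_) return;
    auto kth = rows_.begin() + static_cast<std::ptrdiff_t>(k_);
    std::nth_element(rows_.begin(), kth, rows_.end(), std::greater<int>());
    rows_.resize(k_);
  }

  std::size_t k_;
  std::size_t prune_threshold_;
  std::vector<int> rows_;
};
```

      With k = 3 and prune_threshold = 8, the buffer never holds more than 8 rows plus the size of the incoming batch, regardless of how many batches arrive. A real implementation would have to prune record batches by sort key rather than a flat vector of integers.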



            People

              Assignee: Unassigned
              Reporter: Alexander Ocsa (aocsa)
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated: