  1. Apache Arrow
  2. ARROW-15079 [C++] Add scheduler to constrain memory of exec plans
  3. ARROW-14330

[C++] Create DataHolder that can be used for caching during exec plans

Details

    Description

      The purpose of this task is to create an ExecNode that provides the following functionality:

      1. Obtain heuristics about current memory consumption and enforce a memory consumption threshold.
      2. When memory consumption is above the threshold, write each incoming ExecBatch to disk. The queue then stores either the ExecBatch itself or a handle to its file.
      3. Provide an API for pulling an ExecBatch from the queue. It should favor in-memory batches, returning all of them first, and only then the batches that are stored as file handles.
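
      The three requirements above can be sketched as a small queue. This is only an illustration under stated assumptions, not the design from the referenced PR: `Batch` is a stand-in for ExecBatch, `SpillingBatchQueue` and its methods are hypothetical names, the queue tracks its own byte count instead of querying a memory pool, and spilled batches are written as raw bytes rather than in an Arrow file format.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <filesystem>
#include <fstream>
#include <ios>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for ExecBatch (assumption): a flat buffer of 64-bit values.
using Batch = std::vector<std::int64_t>;

// Hypothetical holder: keeps batches in memory up to a byte threshold and
// spills the rest to files under spill_dir.
class SpillingBatchQueue {
 public:
  SpillingBatchQueue(std::size_t threshold_bytes, std::filesystem::path spill_dir)
      : threshold_bytes_(threshold_bytes), spill_dir_(std::move(spill_dir)) {}

  // Keeps the batch in memory if it fits under the threshold, otherwise
  // writes it to disk and queues only the file path.
  void Push(Batch batch) {
    const std::size_t bytes = batch.size() * sizeof(std::int64_t);
    if (bytes_in_memory_ + bytes <= threshold_bytes_) {
      bytes_in_memory_ += bytes;
      in_memory_.push_back(std::move(batch));
      return;
    }
    std::filesystem::path path =
        spill_dir_ / ("spilled_batch_" + std::to_string(next_id_++) + ".bin");
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(batch.data()),
              static_cast<std::streamsize>(bytes));
    spilled_.push_back(std::move(path));
  }  // `out` is flushed and closed here

  // Returns in-memory batches first, then reloads spilled batches from disk.
  // Returns std::nullopt once both queues are empty.
  std::optional<Batch> Pop() {
    if (!in_memory_.empty()) {
      Batch batch = std::move(in_memory_.front());
      in_memory_.pop_front();
      bytes_in_memory_ -= batch.size() * sizeof(std::int64_t);
      return batch;
    }
    if (!spilled_.empty()) {
      std::filesystem::path path = std::move(spilled_.front());
      spilled_.pop_front();
      const std::size_t bytes =
          static_cast<std::size_t>(std::filesystem::file_size(path));
      Batch batch(bytes / sizeof(std::int64_t));
      {
        std::ifstream in(path, std::ios::binary);
        in.read(reinterpret_cast<char*>(batch.data()),
                static_cast<std::streamsize>(bytes));
      }
      std::filesystem::remove(path);  // the file handle is single-use
      return batch;
    }
    return std::nullopt;
  }

  std::size_t bytes_in_memory() const { return bytes_in_memory_; }
  std::size_t spilled_count() const { return spilled_.size(); }

 private:
  std::size_t threshold_bytes_;
  std::filesystem::path spill_dir_;
  std::size_t bytes_in_memory_ = 0;
  std::size_t next_id_ = 0;
  std::deque<Batch> in_memory_;
  std::deque<std::filesystem::path> spilled_;
};
```

      Because in-memory batches are always returned before spilled ones, this sketch does not preserve arrival order. Error handling for failed reads and writes is omitted, and a real node would presumably also need synchronization if batches arrive from multiple threads.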

       

      PRs to reference

      https://github.com/apache/arrow/pull/11017/

       

            People

              Assignee: Unassigned
              Reporter: Alexander Ocsa (aocsa)
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 50m