Apache Arrow / ARROW-13263

[C++][Compute] Allow Fragments to attach guarantees to scanned batches

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: C++
    • Labels: None

    Description

      A Fragment may be able to attach guarantee expressions to individual batches, beyond just its partition expression. For example, a Parquet fragment can attach row group statistics. Subsequent ExecNodes can leverage these guarantees to optimize execution, for example by skipping filter expressions that a guarantee already makes unnecessary.
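
To illustrate how a per-batch guarantee could let a downstream node skip a filter, here is a minimal, self-contained sketch. It does not use Arrow's types: `RangeGuarantee`, `GuaranteedBatch`, `SimplifyGreaterThan`, and `FilterGreaterThan` are hypothetical stand-ins, and the guarantee is assumed to be a simple min/max range of the kind row group statistics could provide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a per-batch guarantee: every value in the batch
// lies in [min, max], as row group statistics could report.
struct RangeGuarantee {
  int64_t min;
  int64_t max;
};

// Hypothetical stand-in for a scanned batch: one int64 column plus its guarantee.
struct GuaranteedBatch {
  std::vector<int64_t> values;
  RangeGuarantee guarantee;
};

enum class FilterOutcome { kAllPass, kNonePass, kMustEvaluate };

// Decide what the filter `value > threshold` does to a batch,
// looking only at the guarantee and never at the rows.
FilterOutcome SimplifyGreaterThan(const RangeGuarantee& g, int64_t threshold) {
  assert(g.min <= g.max);
  if (g.min > threshold) return FilterOutcome::kAllPass;
  if (g.max <= threshold) return FilterOutcome::kNonePass;
  return FilterOutcome::kMustEvaluate;
}

// Apply the filter. *rows_evaluated reports how many rows were actually
// inspected, so callers can see when the guarantee made the filter a no-op.
std::vector<int64_t> FilterGreaterThan(const GuaranteedBatch& batch,
                                       int64_t threshold,
                                       std::size_t* rows_evaluated) {
  *rows_evaluated = 0;
  switch (SimplifyGreaterThan(batch.guarantee, threshold)) {
    case FilterOutcome::kAllPass:
      return batch.values;  // filter is redundant for this batch
    case FilterOutcome::kNonePass:
      return {};            // whole batch can be discarded
    case FilterOutcome::kMustEvaluate:
      break;
  }
  std::vector<int64_t> out;
  for (int64_t v : batch.values) {
    ++*rows_evaluated;
    if (v > threshold) out.push_back(v);
  }
  return out;
}
```

In this sketch, a batch whose guarantee is `[10, 30]` passes `value > 5` without touching any row, is dropped entirely for `value > 30`, and is only scanned row by row when the threshold falls inside the range.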

      In its simplest form this is probably an overload of ScanBatchesAsync which yields ExecBatches (which have a guarantee property) instead of RecordBatches. When transforming to an ExecBatch (see MakeExecBatch in expression.h), it would also be useful to eagerly drop columns that are not referenced by any node in the graph, in case the Fragment couldn't push the projection down any further.
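
The conversion step described above might look roughly like the following sketch. It is not Arrow's `MakeExecBatch`: `RecordBatchSketch`, `ExecBatchSketch`, and `ToExecBatchSketch` are hypothetical stand-ins, columns are plain `int64_t` vectors, and the guarantee is carried as a string placeholder rather than a real expression.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scanned record batch: parallel name/column lists.
struct RecordBatchSketch {
  std::vector<std::string> names;
  std::vector<std::vector<int64_t>> columns;
};

// Hypothetical stand-in for an exec batch that also carries a guarantee.
struct ExecBatchSketch {
  std::vector<std::string> names;
  std::vector<std::vector<int64_t>> columns;
  std::string guarantee;  // textual placeholder for a guarantee expression
};

// Convert a record batch into an exec batch, attaching the guarantee and
// eagerly dropping every column that no node in the graph references.
ExecBatchSketch ToExecBatchSketch(const RecordBatchSketch& batch,
                                  const std::set<std::string>& referenced,
                                  std::string guarantee) {
  assert(batch.names.size() == batch.columns.size());
  ExecBatchSketch out;
  out.guarantee = std::move(guarantee);
  for (std::size_t i = 0; i < batch.names.size(); ++i) {
    if (referenced.count(batch.names[i]) == 0) continue;  // unreferenced: drop
    out.names.push_back(batch.names[i]);
    out.columns.push_back(batch.columns[i]);
  }
  return out;
}
```

Dropping columns at this point is a fallback: if the Fragment already pushed the projection down, the `referenced` set matches the batch's columns and the loop copies everything through unchanged.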

          People

            Assignee: Unassigned
            Reporter: Ben Kietzman (bkietz)
            Votes: 0
            Watchers: 4

            Dates

              Created:
              Updated: