[ARROW-15238] [C++] Create "engine" module for the query engine - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 8.0.0
Component/s: C++
Labels:
- pull-request-available
- query-engine

External issue URL:
https://github.com/apache/arrow/issues/30735

Description

Circular dependencies are appearing in the query engine because the compute module sits at a very low level. For example, it would be nice if the default registry included the scan node and the dataset write node, but those belong to the datasets module, which itself depends on compute. We will also want to add spillover support at some point, and that will rely on parquet/dataset operations.

We should create a dedicated engine module that contains the query plans, the nodes, etc. This module would not contain the kernels or other low-level compute primitives. That would allow a dependency chain such as:

engine -> datasets (for scanning) -> parquet -> compute (for calculating statistics)

The base ExecPlan itself could go in either compute or engine, whichever causes the least friction.
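
As a rough illustration of the layering proposed above, the sketch below shows how a top-level engine module could assemble a default registry that includes the scan and write nodes without compute ever referring to datasets. Every name here (`FactoryRegistry`, `RegisterComputeNodes`, `RegisterDatasetNodes`, `MakeDefaultRegistry`, and the node names) is a hypothetical stand-in chosen for this example, not Arrow's actual API, and the factories just return strings instead of building real nodes.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// NOTE: all types and functions below are hypothetical stand-ins,
// not Arrow's real classes.

namespace compute {
// Lowest layer: stores node factories but does not know which
// higher-level nodes exist.
class FactoryRegistry {
 public:
  using Factory = std::function<std::string()>;

  // Returns false if a factory with this name is already registered.
  bool Add(const std::string& name, Factory factory) {
    return factories_.emplace(name, std::move(factory)).second;
  }

  bool Has(const std::string& name) const {
    return factories_.count(name) > 0;
  }

 private:
  std::map<std::string, Factory> factories_;
};

// Registers only nodes that need nothing above the compute layer.
inline void RegisterComputeNodes(FactoryRegistry* registry) {
  registry->Add("filter", [] { return std::string("filter"); });
  registry->Add("project", [] { return std::string("project"); });
}
}  // namespace compute

namespace dataset {
// Middle layer: depends on compute, never the other way around.
inline void RegisterDatasetNodes(compute::FactoryRegistry* registry) {
  registry->Add("scan", [] { return std::string("scan"); });
  registry->Add("write", [] { return std::string("write"); });
}
}  // namespace dataset

namespace engine {
// Top layer: the only place that sees both lower layers, so the
// default registry can include scan and write without a cycle.
inline compute::FactoryRegistry MakeDefaultRegistry() {
  compute::FactoryRegistry registry;
  compute::RegisterComputeNodes(&registry);
  dataset::RegisterDatasetNodes(&registry);
  return registry;
}
}  // namespace engine
```

Under these assumptions, a caller would obtain the registry with `engine::MakeDefaultRegistry()` and look up `"scan"` or `"write"` by name; a later spillover node could be registered from the engine layer in the same way.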

Issue Links

is depended upon by

ARROW-15257 [C++] Simplify ExecPlan's C++ interface

Open

links to

GitHub Pull Request #11707

GitHub Pull Request #12279

People

Assignee: Jeroen van Straten

Reporter: Weston Pace

Votes: 0

Watchers: 3

Dates

Created: 03/Jan/22 19:47

Updated: 11/Jan/23 08:45

Resolved: 16/Feb/22 02:22
