Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
The theme of this overall epic is to make the plan and expression rewriting phases of DataFusion more efficient by leveraging the Rust type system to avoid copies
Benefits:
- More standard / idiomatic Rust usage
- Faster / more efficient (I don't have numbers to back this up)
Downsides:
- These will be backwards incompatible changes
Background
Many things in DataFusion look like
Input -transformation-> Output
And the input is not used again. In Rust, you can model this by giving ownership of the input to the transformation
At a high level, the idea is to avoid so much cloning in DataFusion
The basic principle is that if a function needs to `clone` one of its arguments, the caller should be given the choice of when to do that. Often, the caller can give up ownership without issue
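A minimal sketch of this principle, using a hypothetical toy `Node` type rather than any real DataFusion type. The function names `rename_borrowed` and `rename_owned` are made up for illustration. The borrowing version always pays for a deep copy, while the owning version lets the caller decide whether a copy is needed:

```rust
// Hypothetical toy tree node standing in for a plan or expression.
// This is NOT a DataFusion type; it only illustrates the ownership pattern.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub name: String,
    pub children: Vec<Node>,
}

/// Borrowing style: the function must clone internally, so every call
/// deep-copies the whole tree even when the caller no longer needs the input.
pub fn rename_borrowed(input: &Node, name: &str) -> Node {
    let mut out = input.clone(); // unconditional deep copy
    out.name = name.to_string();
    out
}

/// Owning style: the function takes the input by value and reuses its
/// allocations. A caller that still needs the original clones explicitly.
pub fn rename_owned(mut input: Node, name: &str) -> Node {
    input.name = name.to_string();
    input
}
```

With the owning version, a caller that is done with the input writes `rename_owned(node, "b")` and no copy is made. A caller that still needs the original writes `rename_owned(node.clone(), "b")`, which makes the cost visible at the call site.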
I envision at least the following items:
1. Optimizer passes that take `&LogicalPlan` and produce a new `LogicalPlan` even though most call sites do not need the original
2. Expr builder calls that take `&Expr` and return a new `Expr`
3. An expression rewriter (TODO) while running down optimizer passes
I think this style takes advantage of Rust's ownership model and will let us avoid a lot of copying and allocations, and avoid the need for something like slab allocators
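As a rough illustration of items 2 and 3, here is a sketch of an owning expression builder and an owning rewriter. The `Expr` enum below is a simplified stand-in defined only for this example, and `add` and `fold_constants` are hypothetical names, not the actual DataFusion API. The rewriter consumes its input and moves unchanged subtrees into the result instead of cloning them:

```rust
// Simplified stand-in for an expression tree. This is an assumption made
// for illustration and does not mirror DataFusion's real `Expr` definition.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

/// Builder that takes ownership of both operands, so no clone is needed
/// to place them inside the new node.
pub fn add(left: Expr, right: Expr) -> Expr {
    Expr::Add(Box::new(left), Box::new(right))
}

/// Rewriter that consumes the input expression and folds additions of two
/// literals. Subtrees that do not change are moved, not copied.
pub fn fold_constants(expr: Expr) -> Expr {
    match expr {
        Expr::Add(left, right) => {
            // Rewrite the children first, moving them out of their boxes.
            let left = fold_constants(*left);
            let right = fold_constants(*right);
            match (left, right) {
                (Expr::Literal(a), Expr::Literal(b)) => match a.checked_add(b) {
                    Some(sum) => Expr::Literal(sum),
                    // Leave the addition unfolded if it would overflow.
                    None => add(Expr::Literal(a), Expr::Literal(b)),
                },
                (l, r) => add(l, r),
            }
        }
        // Leaves are returned as-is, without allocation.
        other => other,
    }
}
```

Under these assumptions, rewriting `x + (1 + 2)` produces `x + 3`, and the `Column` node carrying `x` is moved from the input tree into the output without being copied.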
Issue Links
- relates to ARROW-10243 [Rust] [Datafusion] Optimize literal expression evaluation (Closed)