[ARROW-9832] [Rust] [DataFusion] Refactor PhysicalPlan to remove Partition - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.0.0
Component/s: Rust, Rust - DataFusion
Labels: None
External issue URL: https://github.com/apache/arrow/issues/25872

Description

As a step towards supporting an improved threading model, I would like to remove the redundant `Partition` trait. Each implementation of this trait merely duplicates the state of its operator and adds a partition number. It would be simpler to pass the partition number directly to the execute() method in the PhysicalPlan trait.

As a consequence, each ExecutionPlan must also declare its output partitioning. This information will be needed for other reasons as well once work on the physical optimizer begins.

Proposed trait:

/// Partition-aware execution plan for a relation
pub trait ExecutionPlan: Debug {
    /// Get the schema for this execution plan
    fn schema(&self) -> SchemaRef;
    /// Specifies the output partitioning of this execution plan
    fn output_partitioning(&self) -> Partitioning;
    /// Execute this plan for a single partition and return a stream of results
    fn execute(&self, partition: usize) -> Result<Arc<Mutex<dyn RecordBatchReader + Send + Sync>>>;
}

/// Partitioning schemes supported by operators.
#[derive(Debug, Clone)]
pub enum Partitioning {
    UnknownPartitioning(usize),
}
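
To illustrate how an operator might implement the proposed trait without a separate `Partition` type, here is a minimal sketch. It is not the DataFusion implementation: `SchemaRef`, `Result`, and `BatchReader` are simplified stand-ins for the Arrow types so the example compiles without the arrow crate, and `MemoryScan`, `VecReader`, `partition_count`, and `collect_all` are hypothetical names introduced only for this example.

```rust
use std::fmt::Debug;
use std::sync::{Arc, Mutex};

// Simplified stand-ins for the Arrow types (assumptions, not the real API).
type SchemaRef = Arc<Vec<String>>;
type Result<T> = std::result::Result<T, String>;

/// Stand-in for Arrow's RecordBatchReader: a "batch" is just a Vec<i64>.
pub trait BatchReader {
    fn next_batch(&mut self) -> Result<Option<Vec<i64>>>;
}

#[derive(Debug, Clone)]
pub enum Partitioning {
    UnknownPartitioning(usize),
}

impl Partitioning {
    /// Hypothetical helper: number of output partitions.
    pub fn partition_count(&self) -> usize {
        match self {
            Partitioning::UnknownPartitioning(n) => *n,
        }
    }
}

/// The proposed trait, with BatchReader substituted for RecordBatchReader.
pub trait ExecutionPlan: Debug {
    fn schema(&self) -> SchemaRef;
    fn output_partitioning(&self) -> Partitioning;
    fn execute(&self, partition: usize) -> Result<Arc<Mutex<dyn BatchReader + Send + Sync>>>;
}

/// Hypothetical operator that serves pre-partitioned in-memory data.
#[derive(Debug)]
pub struct MemoryScan {
    schema: SchemaRef,
    partitions: Vec<Vec<i64>>,
}

impl MemoryScan {
    pub fn new(partitions: Vec<Vec<i64>>) -> Self {
        MemoryScan { schema: Arc::new(vec!["value".to_string()]), partitions }
    }
}

/// Reader that yields the data of one partition as a single batch.
struct VecReader {
    data: Option<Vec<i64>>,
}

impl BatchReader for VecReader {
    fn next_batch(&mut self) -> Result<Option<Vec<i64>>> {
        Ok(self.data.take())
    }
}

impl ExecutionPlan for MemoryScan {
    fn schema(&self) -> SchemaRef {
        self.schema.clone()
    }

    fn output_partitioning(&self) -> Partitioning {
        Partitioning::UnknownPartitioning(self.partitions.len())
    }

    // The partition index arrives as an argument, so no per-partition
    // struct duplicating the operator state is required.
    fn execute(&self, partition: usize) -> Result<Arc<Mutex<dyn BatchReader + Send + Sync>>> {
        let data = self
            .partitions
            .get(partition)
            .cloned()
            .ok_or_else(|| format!("invalid partition {}", partition))?;
        Ok(Arc::new(Mutex::new(VecReader { data: Some(data) })))
    }
}

/// Hypothetical driver: executes every partition in order and concatenates the rows.
pub fn collect_all(plan: &dyn ExecutionPlan) -> Result<Vec<i64>> {
    let mut out = Vec::new();
    for p in 0..plan.output_partitioning().partition_count() {
        let reader = plan.execute(p)?;
        let mut guard = reader.lock().map_err(|e| e.to_string())?;
        while let Some(batch) = guard.next_batch()? {
            out.extend(batch);
        }
    }
    Ok(out)
}
```

In this sketch the caller learns how many partitions exist from `output_partitioning()` and then calls `execute(p)` once per partition index. A real scheduler could presumably dispatch those calls to separate threads instead of looping sequentially.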

People

Assignee: Andy Grove
Reporter: Andy Grove
Votes: 0
Watchers: 2

Dates

Created: 22/Aug/20 20:05
Updated: 11/Jan/23 08:09
Resolved: 25/Aug/20 01:31
