Details
Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Description
The DataFrame API has a `collect` method that invokes the `collect(plan: Arc<dyn ExecutionPlan>) -> Result<Vec<RecordBatch>>` function. That function gathers all records into a single vector of RecordBatches, removing the partitioning via `MergeExec`.
The DataFrame should also expose a `collect_partitioned` method so that the partitioning can be preserved:
```
collect_partitioned(
    plan: Arc<dyn ExecutionPlan>,
) -> Result<Vec<Vec<RecordBatch>>>
```
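To illustrate the difference between the two return shapes, here is a minimal sketch that uses stand-in types rather than the real DataFusion/Arrow ones. `RecordBatch` is modelled as a plain `Vec<i32>` and `PartitionedPlan` is a hypothetical placeholder for an execution plan. Neither is the library's actual API, and the real functions are asynchronous and return `Result`.

```rust
// Stand-in for Arrow's RecordBatch: a batch is just a vector of row values.
// Illustrative assumption only, not the real arrow type.
type RecordBatch = Vec<i32>;

/// Hypothetical stand-in for an execution plan: each inner Vec holds
/// the batches produced by one partition.
struct PartitionedPlan {
    partitions: Vec<Vec<RecordBatch>>,
}

/// Keeps partition boundaries: one Vec<RecordBatch> per partition.
fn collect_partitioned(plan: &PartitionedPlan) -> Vec<Vec<RecordBatch>> {
    plan.partitions.clone()
}

/// Merges all partitions into one flat list of batches,
/// so the partition boundaries are lost.
fn collect(plan: &PartitionedPlan) -> Vec<RecordBatch> {
    plan.partitions.iter().flatten().cloned().collect()
}
```

With two partitions holding two batches and one batch respectively, `collect_partitioned` would return two inner vectors, while `collect` would return a single vector of three batches with no indication of which partition each came from.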