Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 8.0.0
Description
As part of ARROW-15271, map_batches() was modified to return a RecordBatchReader. However, the current implementation collects all results into a list of record batches and only then converts that list to a reader, so nothing is actually streamed. If we push the implementation down to C++, we should in theory be able to return a proper streaming RecordBatchReader.
The difficulty is that we won't know the output schema ahead of time, and a reader has to report its schema before it yields any batches. We could optionally accept the schema as an argument, which would allow the function to be fully lazy. Alternatively, we could eagerly evaluate just the first batch to determine the schema.
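The two schema options can be illustrated with a small language-neutral sketch. This is not Arrow code and not the proposed C++ implementation: batches are modeled as plain dicts of column lists, a schema as a tuple of column names, and `StreamingReader`, `schema_of`, and `map_batches_streaming` are hypothetical names made up for the illustration.

```python
from itertools import chain


def schema_of(batch):
    """Stand-in for a schema: the ordered column names of a dict batch."""
    return tuple(batch.keys())


class StreamingReader:
    """Hypothetical stand-in for a reader: schema known up front, batches pulled lazily."""

    def __init__(self, schema, batches):
        self.schema = schema
        self._batches = iter(batches)

    def __iter__(self):
        return self

    def __next__(self):
        batch = next(self._batches)
        # A streaming reader can only verify the schema as each batch arrives.
        if schema_of(batch) != self.schema:
            raise ValueError(f"expected {self.schema}, got {schema_of(batch)}")
        return batch


def map_batches_streaming(batches, fun, schema=None):
    """Apply `fun` to each batch without collecting the results first."""
    mapped = map(fun, batches)  # lazy: `fun` is not called yet
    if schema is None:
        # Option 2: evaluate only the first batch to learn the schema,
        # then put that batch back at the front of the stream.
        try:
            first = next(mapped)
        except StopIteration:
            raise ValueError("cannot infer a schema from an empty input") from None
        schema = schema_of(first)
        mapped = chain([first], mapped)
    # Option 1 (schema supplied): nothing has been evaluated at this point.
    return StreamingReader(schema, mapped)
```

With a schema supplied, the mapping function is not called until the first batch is requested. Without one, it is called exactly once up front and the remaining batches are still produced on demand.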