Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- None
Description
I don't believe write_dataset is meant to be internal. pyarrow.parquet.write_to_dataset uses it (if use_legacy_dataset=False), but the parquet API doesn't expose the same features. A new example should probably also be added to the Tabular Datasets section of the docs explaining why write_dataset can take a scanner (e.g. to limit memory use, or to write a dataset from Flight or any other record batch source).
Attachments
Issue Links
- is duplicated by
- ARROW-13207 [Python][Doc] Dataset documentation still suggests deprecated scan method as the preferred iterative approach
- Closed
- links to