Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 0.16.0
Description
This can be costly and is not always necessary.
At the same time, we could move file validation into the scan tasks. Currently, all files are inspected as the dataset is constructed, which can be expensive if the filesystem is slow. Validation would then be performed multiple times, but the check will be cheap because at scan time we'll be reading the file into memory anyway.
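The idea of deferring validation to scan time can be sketched roughly as follows. This is a minimal illustration using only the C++ standard library, not the Arrow Dataset API: `Validator`, `ScanTask`, `Dataset`, `Scan()`, and `Execute()` are hypothetical names chosen for this example, and the validity check is passed in as a callback so the sketch does not depend on any real file format.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a file format's validity check.
// This is an assumption for illustration, not an Arrow API.
using Validator = std::function<bool(const std::string& path)>;

// A unit of scan work for a single file. Validation is deferred to Execute().
class ScanTask {
 public:
  ScanTask(std::string path, Validator is_valid)
      : path_(std::move(path)), is_valid_(std::move(is_valid)) {}

  // Validates the file and then "reads" it. A real task would open the file
  // once and use it for both steps; this sketch only returns the path.
  std::string Execute() const {
    if (!is_valid_(path_)) {
      throw std::runtime_error("invalid file: " + path_);
    }
    return path_;
  }

 private:
  std::string path_;
  Validator is_valid_;
};

// Constructing the dataset only records paths; no file is inspected here.
class Dataset {
 public:
  Dataset(std::vector<std::string> paths, Validator is_valid)
      : paths_(std::move(paths)), is_valid_(std::move(is_valid)) {}

  // Creates one task per file. Validation still has not run at this point.
  std::vector<ScanTask> Scan() const {
    std::vector<ScanTask> tasks;
    tasks.reserve(paths_.size());
    for (const auto& path : paths_) {
      tasks.emplace_back(path, is_valid_);
    }
    return tasks;
  }

 private:
  std::vector<std::string> paths_;
  Validator is_valid_;
};
```

In this sketch an invalid file is only detected when its task is executed, so each new scan repeats the check. That is the trade-off described above: construction no longer touches the filesystem, at the cost of validating again whenever a file is scanned.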
Issue Links
- relates to ARROW-7673 [C++][Dataset] Revisit File discovery failure mode (Resolved)
- links to