Currently, a fragment is a product of a scan: it is a lazy collection of scan tasks corresponding to a logically singular data source (such as a single file or a single row group). It would be more useful if a fragment were instead the direct object of a scan, so that one scans a fragment (or a collection of fragments):
- Remove ScanOptions from Fragment's properties and move it into Fragment::Scan parameters.
- Remove ScanOptions from Dataset::GetFragments. To support predicate pushdown in FileSystemDataset and UnionDataset, we can provide an overload: Dataset::GetFragments(std::shared_ptr<Expression> predicate).
- Expose a lazy accessor, Fragment::physical_schema().
- Consolidate ScanOptions and ScanContext.
This should reduce the conceptual mismatch between fragments and files, since fragments will no longer hold references to scan properties.
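
A minimal sketch of the shape proposed above, to make the intended relationships concrete. It is self-contained and does not use the Arrow headers: Row, the std::function predicates, the partition_key field, and MakeExampleDataset are all hypothetical stand-ins, and the Fragment, Dataset, and ScanOptions classes here only mirror the names in this proposal, not the real signatures.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are the real Arrow dataset classes.
struct Row {
  int64_t x;
};

// Consolidated scan parameters (the role ScanOptions and ScanContext would share).
struct ScanOptions {
  std::function<bool(const Row&)> filter = [](const Row&) { return true; };
  bool use_threads = false;
};

class Fragment {
 public:
  Fragment(int64_t partition_key, std::vector<Row> rows)
      : partition_key_(partition_key), rows_(std::move(rows)) {}

  // Scan options arrive as a parameter; the fragment itself stores none.
  std::vector<Row> Scan(const ScanOptions& options) const {
    assert(options.filter && "filter must be callable");
    std::vector<Row> out;
    for (const Row& row : rows_) {
      if (options.filter(row)) out.push_back(row);
    }
    return out;
  }

  // Lazy accessor: computed on first use, then cached (not thread-safe here).
  const std::string& physical_schema() const {
    if (schema_ == nullptr) {
      schema_ = std::make_shared<std::string>("x: int64");
    }
    return *schema_;
  }

  int64_t partition_key() const { return partition_key_; }

 private:
  int64_t partition_key_;
  std::vector<Row> rows_;
  mutable std::shared_ptr<std::string> schema_;
};

using FragmentVector = std::vector<std::shared_ptr<Fragment>>;

class Dataset {
 public:
  explicit Dataset(FragmentVector fragments) : fragments_(std::move(fragments)) {}

  // Enumerating fragments requires no ScanOptions.
  FragmentVector GetFragments() const { return fragments_; }

  // Overload for predicate pushdown, simplified to a test on the partition key.
  FragmentVector GetFragments(const std::function<bool(int64_t)>& predicate) const {
    FragmentVector out;
    for (const auto& fragment : fragments_) {
      if (predicate(fragment->partition_key())) out.push_back(fragment);
    }
    return out;
  }

 private:
  FragmentVector fragments_;
};

// Builds a two-fragment dataset keyed by a made-up "year" partition.
Dataset MakeExampleDataset() {
  FragmentVector fragments;
  fragments.push_back(std::make_shared<Fragment>(2019, std::vector<Row>{{1}, {2}, {3}}));
  fragments.push_back(std::make_shared<Fragment>(2020, std::vector<Row>{{4}, {5}}));
  return Dataset(std::move(fragments));
}
```

In this sketch the same fragment can be scanned repeatedly with different ScanOptions, which is the point of making the fragment the object of a scan rather than its product. A caller would first narrow the fragments with GetFragments(predicate) and then pass a filter to Fragment::Scan for the row-level work.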