Description
For file system based data sources, implementing Hive-style partitioning support can be complex and error-prone. Specifically, partitioning support includes:
- Partition discovery: Given a directory laid out like Hive partitions (for example, year=2015/month=1/), discover the directory structure and partitioning information automatically, including partition column names, data types, and values.
- Reading from partitioned tables
- Writing to partitioned tables
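The partition discovery step above can be sketched roughly as follows. This is only an illustration, written in Python for brevity; the function names (infer_type, merge_types, parse_partition_dirs, discover_partitions), the three-type lattice (int, double, string), and the example paths are assumptions made for this sketch and not part of any existing API.

```python
import re
from typing import Dict, List, Tuple

_INT = re.compile(r"-?\d+")
_FLOAT = re.compile(r"-?\d+\.\d+")


def infer_type(raw: str) -> str:
    """Infer a type name for one raw partition value (assumed int/double/string)."""
    if _INT.fullmatch(raw):
        return "int"
    if _FLOAT.fullmatch(raw):
        return "double"
    return "string"


def merge_types(a: str, b: str) -> str:
    """Widen two inferred types: int + double -> double, anything else -> string."""
    if a == b:
        return a
    if {a, b} == {"int", "double"}:
        return "double"
    return "string"


def parse_partition_dirs(path: str, base: str) -> List[Tuple[str, str]]:
    """Return (column, raw value) pairs from the key=value directories under base."""
    if not path.startswith(base):
        raise ValueError(f"{path} is not under {base}")
    # Drop the last segment (the file name) and keep only directory segments.
    segments = path[len(base):].strip("/").split("/")[:-1]
    return [tuple(seg.split("=", 1)) for seg in segments if "=" in seg]


def discover_partitions(
    paths: List[str], base: str
) -> Tuple[Dict[str, str], List[Dict[str, str]]]:
    """Return (column -> inferred type, per-file raw partition values)."""
    schema: Dict[str, str] = {}
    rows: List[Dict[str, str]] = []
    columns = None
    for path in paths:
        pairs = parse_partition_dirs(path, base)
        names = [name for name, _ in pairs]
        if columns is None:
            columns = names
        elif names != columns:
            # Every file is expected to sit under the same partition columns.
            raise ValueError(f"inconsistent partition columns in {path}")
        for name, raw in pairs:
            inferred = infer_type(raw)
            schema[name] = merge_types(schema[name], inferred) if name in schema else inferred
        rows.append(dict(pairs))
    return schema, rows


example_paths = [
    "/data/events/year=2015/month=1/part-0.parquet",
    "/data/events/year=2016/month=12/part-0.parquet",
]
example_schema, example_rows = discover_partitions(example_paths, "/data/events")
```

Even this simplified version needs path parsing, type widening, and consistency checks, which is the kind of logic each file-based data source would otherwise have to reimplement.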
It would be good to have first-class partitioning support in the data sources API. For example, add a FileBasedScan trait with callbacks and default implementations for these features.
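One possible shape for such a trait is sketched below, using a Python abstract base class as a stand-in for a trait. Everything beyond the name FileBasedScan is hypothetical: the callback names (read_file, write_file), the default methods (partition_dir, write, read), the part-0 file name, and the InMemoryScan test double are assumptions made for illustration, not a proposed API.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List

Row = Dict[str, object]


class FileBasedScan(ABC):
    """Hypothetical base: subclasses supply per-file I/O, the base handles partitioning."""

    @abstractmethod
    def read_file(self, path: str) -> Iterator[Row]:
        """Callback: read the rows stored in a single data file."""

    @abstractmethod
    def write_file(self, path: str, rows: List[Row]) -> None:
        """Callback: write rows to a single data file."""

    def partition_dir(self, row: Row, partition_columns: List[str]) -> str:
        # Default: Hive-style key=value directories, one level per column.
        return "/".join(f"{col}={row[col]}" for col in partition_columns)

    def write(self, base: str, rows: List[Row], partition_columns: List[str]) -> None:
        # Default: group rows by partition directory and drop the partition
        # columns from the data, since the directory name already encodes them.
        groups: Dict[str, List[Row]] = {}
        for row in rows:
            directory = self.partition_dir(row, partition_columns)
            data = {k: v for k, v in row.items() if k not in partition_columns}
            groups.setdefault(directory, []).append(data)
        for directory, group in groups.items():
            self.write_file(f"{base}/{directory}/part-0", group)

    def read(self, base: str, files: List[str]) -> Iterator[Row]:
        # Default: re-attach partition values parsed from each file's path.
        # Values come back as strings here; type inference is left out.
        for path in files:
            segments = path[len(base):].strip("/").split("/")[:-1]
            values = dict(seg.split("=", 1) for seg in segments if "=" in seg)
            for row in self.read_file(path):
                yield {**row, **values}


class InMemoryScan(FileBasedScan):
    """Toy data source that keeps 'files' in a dict, for demonstration only."""

    def __init__(self) -> None:
        self.files: Dict[str, List[Row]] = {}

    def read_file(self, path: str) -> Iterator[Row]:
        return iter(self.files[path])

    def write_file(self, path: str, rows: List[Row]) -> None:
        self.files[path] = list(rows)


scan = InMemoryScan()
scan.write("/t", [{"id": 1, "year": 2015}, {"id": 2, "year": 2016}], ["year"])
read_back = list(scan.read("/t", sorted(scan.files)))
```

With this split, a concrete data source only implements the two per-file callbacks, while directory layout, grouping on write, and re-attaching partition values on read come from the shared defaults.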