Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
For filesystem partitioning, we want to use the existing directory structure of the data. If a selection is a directory that contains subdirectories, the name of the subdirectory in which a given record was stored can be included as a field in that record. For example, given this structure:
/data
  /a
    file.csv
  /b
    file.csv
select * from dfs.`/data`
will include a column named dir0, with possible values a and b. This can be extended to a hierarchy of partitions. For example,
/data
  /a
    /1
      file.csv
    /2
      file.csv
  /b
    file.csv
would have columns dir0 (with possible values a and b) and dir1 (with possible values 1, 2 and null). dir1 is null for records read from /b/file.csv, because that file has no second-level directory.
The data type will always be VARCHAR for the partition columns.
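The mapping from directory structure to partition columns described above can be sketched in Python. This is only an illustration of the proposed behavior, not Drill's implementation: the function names `partition_columns` and `max_depth` are hypothetical, and the sketch assumes the number of dirN columns equals the deepest directory nesting below the selected root. Values are kept as strings (or None) to mirror the VARCHAR type.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Optional


def max_depth(root: str, file_paths: List[str]) -> int:
    """Deepest directory nesting below root (hypothetical helper)."""
    base = PurePosixPath(root)
    # parts of the relative path include the file name, so subtract 1.
    return max(len(PurePosixPath(p).relative_to(base).parts) - 1 for p in file_paths)


def partition_columns(root: str, file_path: str, depth: int) -> Dict[str, Optional[str]]:
    """Map the directories between root and the file to dir0, dir1, ..."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(root))
    dirs = relative.parts[:-1]  # drop the file name, keep directories only
    # Levels that a file does not have are filled with None (null).
    return {f"dir{i}": dirs[i] if i < len(dirs) else None for i in range(depth)}


# The hierarchical example from the description.
files = ["/data/a/1/file.csv", "/data/a/2/file.csv", "/data/b/file.csv"]
depth = max_depth("/data", files)
rows = [partition_columns("/data", f, depth) for f in files]
```

For the three files above, `rows` holds one dictionary per file; the entry for `/data/b/file.csv` has `dir1` set to None because the file sits directly under `/b`.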