Details
New Feature
Status: Resolved
Major
Resolution: Fixed
3.4.0
None
Description
File-source constant metadata columns are often derived indirectly from file-level metadata values rather than exposing those values directly. For example, _metadata.file_name is currently hard-coded in FileFormat.updateMetadataInternalRow as:
UTF8String.fromString(filePath.getName)
We should add support for metadata extractors, functions that map from PartitionedFile to Literal, so that we can express such columns in a generic way instead of hard-coding them.
We can't simply add these values to the metadata map, because every entry would then have to be pre-computed for each file even when the query never selects that field.
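As a rough illustration of the idea, the sketch below models an extractor as a function from PartitionedFile to Literal and evaluates only the extractors for the fields a query selects. It is not Spark's implementation: the PartitionedFile and Literal case classes are simplified stand-ins for the real Spark classes, and the Extractor alias, the extractors registry, and the extract helper are hypothetical names introduced here.

```scala
// Simplified stand-ins for Spark's PartitionedFile and Literal (assumption:
// the real classes carry more fields and are not reproduced here).
final case class PartitionedFile(filePath: String, start: Long, length: Long)
final case class Literal(value: Any)

object MetadataExtractors {
  // Hypothetical alias: an extractor derives one constant metadata value from a file.
  type Extractor = PartitionedFile => Literal

  // Exposes the path as-is.
  val filePath: Extractor = f => Literal(f.filePath)

  // Derives the file name from the path: everything after the last '/'.
  val fileName: Extractor =
    f => Literal(f.filePath.substring(f.filePath.lastIndexOf('/') + 1))

  // Hypothetical registry keyed by metadata field name. Only functions are
  // stored, so nothing is computed when the registry is built.
  val extractors: Map[String, Extractor] = Map(
    "file_path" -> filePath,
    "file_name" -> fileName
  )

  // Runs only the extractors for the selected fields; unknown names are skipped.
  def extract(file: PartitionedFile, selected: Seq[String]): Map[String, Literal] =
    selected.flatMap(name => extractors.get(name).map(e => name -> e(file))).toMap
}
```

Because the registry holds functions rather than values, a query that selects only file_name never pays for computing any other field, which is the property the plain metadata map cannot offer.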