Details
-
Task
-
Status: Closed
-
Major
-
Resolution: Resolved
-
None
-
None
Description
During the bootstrap of the Metadata Table, all the directories which contain the partition metadata directory are assumed to be partitions and are added to the metadata table.
In our HDFS clusters, we have directories like .backup, .temp which are used by various teams for non-hoodie purposes (e.g. .backup may be keeping a snapshot of the dataset). During bootstrap, Metadata Table ends up containing all those paths also as partitions.
In this patch, I would like to introduce a configuration for HoodieMetadataConfig to filter out some directories based on a regular expression string.
Attachments
Issue Links
- links to