Apache Hudi / HUDI-1611

Allow directories to be filtered during the bootstrap of the metadata table


Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Resolved
    • Fix Version/s: 0.8.0

    Description

      During the bootstrap of the Metadata Table, every directory that contains the partition metadata file is assumed to be a partition and is added to the metadata table.

      In our HDFS clusters, we have directories such as .backup and .temp that various teams use for non-Hudi purposes (e.g. .backup may hold a snapshot of the dataset). During bootstrap, the Metadata Table ends up listing all of those paths as partitions as well.

      In this patch, I would like to introduce a configuration option in HoodieMetadataConfig that filters out directories whose names match a regular expression string.
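
      The filtering idea above could look roughly like the following sketch. It is not the patch itself: the class name DirectoryFilterSketch, the method filterDirectories, and the example regex are hypothetical, and the sketch assumes the regex is matched against the whole directory name and that an empty regex disables filtering.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class DirectoryFilterSketch {

  // Hypothetical helper, not part of Hudi: drops every directory name that
  // fully matches the given regex. A null or empty regex means "no filter".
  public static List<String> filterDirectories(List<String> dirNames, String dirFilterRegex) {
    if (dirFilterRegex == null || dirFilterRegex.isEmpty()) {
      return dirNames;
    }
    Pattern pattern = Pattern.compile(dirFilterRegex);
    return dirNames.stream()
        .filter(name -> !pattern.matcher(name).matches())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Candidate directories found while listing the table base path.
    List<String> candidates = Arrays.asList("2021/02/01", ".backup", ".temp", "2021/02/02");
    // Assumed example regex: exclude .backup and .temp.
    System.out.println(filterDirectories(candidates, "\\.(backup|temp)"));
    // prints [2021/02/01, 2021/02/02]
  }
}
```

      Applying the filter before the check for partition metadata would also avoid listing the contents of the excluded directories, which matters when something like .backup holds a full snapshot.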


            People

              Assignee: Prashant Wason (pwason)
              Reporter: Prashant Wason (pwason)
