Details
Type: Sub-task
Status: Open
Priority: Blocker
Resolution: Unresolved
Description
The regular writer has a flow that checks each metadata table (MDT) partition: if a partition is not enabled in the writer's configs, but it exists in storage and is listed among the table config's fully built-out partitions, Hudi deletes that metadata partition on the assumption that the user wishes to disable it.
This does not work well with the async indexer. Consider the following sequence:
process1: Deltastreamer runs continuously.
No metadata configs are set.
The default value for metadata enable is true, so the "files" partition is instantiated inline on the first commit.
No value is set for col stats enable, so no action is taken for col stats.
process2: The user starts HoodieIndexer for the col stats partition.
Once the indexer completes, tableConfig adds "col stats" to the fully built-out metadata partitions.
Back in process1, when Deltastreamer goes to its next write, it detects that col stats is not enabled (the default value as per code) while tableConfig shows col stats as fully built out. It therefore decides to delete the col stats partition and updates tableConfig, undoing the work of the indexer.
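
The sequence above can be reproduced with a small standalone model. This is only a sketch of the interaction as described in this ticket, not Hudi code: `TableConfig`, `writerCommit`, and `asyncIndexerCompletes` are hypothetical stand-ins, and the writer's reconciliation logic is assumed to be "initialize what is enabled, delete what is built out but not enabled".

```java
import java.util.EnumSet;
import java.util.Set;

public class MdtPartitionRace {
    enum Partition { FILES, COLUMN_STATS }

    // Hypothetical stand-in for the table config's list of
    // fully built-out metadata partitions.
    static final class TableConfig {
        final Set<Partition> completed = EnumSet.noneOf(Partition.class);
    }

    // Hypothetical model of the regular writer's check on each commit.
    static void writerCommit(TableConfig cfg, Set<Partition> enabledInWriterConfig) {
        // Partitions enabled in this writer's config are initialized inline.
        cfg.completed.addAll(enabledInWriterConfig);
        // Partitions that are built out but NOT enabled in this writer's
        // config are treated as "user wants this disabled" and deleted.
        cfg.completed.removeIf(p -> !enabledInWriterConfig.contains(p));
    }

    // Hypothetical model of the async indexer finishing a partition.
    static void asyncIndexerCompletes(TableConfig cfg, Partition p) {
        cfg.completed.add(p);
    }

    static Set<Partition> simulate() {
        TableConfig cfg = new TableConfig();
        // process1: only the default (metadata enabled => "files") applies.
        Set<Partition> writerEnabled = EnumSet.of(Partition.FILES);

        writerCommit(cfg, writerEnabled);                    // first commit builds "files"
        asyncIndexerCompletes(cfg, Partition.COLUMN_STATS);  // process2 builds col stats
        writerCommit(cfg, writerEnabled);                    // next write deletes col stats
        return cfg.completed;
    }

    public static void main(String[] args) {
        System.out.println(simulate()); // prints [FILES]
    }
}
```

In this model the col stats partition built by the indexer is gone after the writer's next commit, because the writer cannot tell "never configured" apart from "explicitly disabled".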