Apache Hudi / HUDI-4138

Fix the concurrent modification of hoodie table config for Flink

Details

    Description

      From [GH|https://github.com/apache/hudi/issues/5553] (by danny0405):

      ---------

      I have filed a fix for Flink here: #5660

      https://issues.apache.org/jira/browse/HUDI-3782 and
      https://issues.apache.org/jira/browse/HUDI-4138 may have caused this bug.

      HoodieTable#getMetadataWriter is used by many async table services, such as cleaning, compaction, and clustering. This method now tries to modify the table config each time it is called, regardless of whether the metadata table is enabled or disabled.

      In general, we should never introduce side effects in the read code path of the hoodie table config or of the hoodie table metadata writer.

      I'm not sure how to fix this on the Spark side. I have two possible fixes in mind:

      1. Make the table config concurrency-safe (not recommended, because that is too heavyweight for a config).
      2. Make sure the metadata cleaning happens only once for the whole job lifetime (still risky because there may be multiple jobs, but the probability is very small). This is the approach I would suggest.
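
To make option 2 concrete, here is a minimal sketch of a once-per-job guard. It is not the change in #5660 and not Hudi code: `OnceOnlyGuard`, `runOnce`, and the counter standing in for the metadata cleaning are all hypothetical names chosen for illustration. The only assumption is that the side-effecting cleanup can be wrapped in a `Runnable`.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Hudi: lets a side-effecting action
// run at most once per guard instance (for example, one guard per job).
final class OnceOnlyGuard {
  private final AtomicBoolean done = new AtomicBoolean(false);

  /** Runs the action only for the first caller; returns true if this call ran it. */
  boolean runOnce(Runnable action) {
    // compareAndSet lets exactly one thread win, even under concurrent calls.
    if (!done.compareAndSet(false, true)) {
      return false;
    }
    // The flag stays set even if the action throws, so a failed attempt is not retried.
    action.run();
    return true;
  }

  public static void main(String[] args) throws InterruptedException {
    OnceOnlyGuard guard = new OnceOnlyGuard();
    // Stand-in for the metadata cleaning; counts how often it actually ran.
    AtomicInteger cleanups = new AtomicInteger();

    // Simulate several async table services reaching the same code path at once.
    Thread[] services = new Thread[4];
    for (int i = 0; i < services.length; i++) {
      services[i] = new Thread(() -> guard.runOnce(cleanups::incrementAndGet));
      services[i].start();
    }
    for (Thread t : services) {
      t.join();
    }
    System.out.println("cleanup runs: " + cleanups.get());
  }
}
```

A guard like this only covers callers inside one JVM. As noted in option 2, separate jobs writing to the same table would each hold their own guard, so the race is made less likely rather than eliminated.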

            People

              danny0405 Danny Chen