Apache Hudi / HUDI-2159

Supporting Clustering and Metadata Table together

Details

    • Type: Task
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.0
    • Component/s: None
    • Labels: None

    Description

      I am testing clustering support on a table with the Metadata Table enabled and found a few issues.

      Setup

      Pipeline 1: Ingestion pipeline with Metadata Table enabled. Runs every 30 mins. 
      Pipeline 2: Clustering pipeline with long running jobs (3-4 hours)
      Pipeline 3: Another clustering pipeline with long running jobs (3-4 hours)

       

      Issue #1: Parallel commits on Metadata Table

      Assume the clustering pipeline is completing T5.replacecommit and the ingestion pipeline is completing T10.commit. The Metadata Table will be synced only up to an instant earlier than T5 (say T4), since it only syncs in completion order.

      Now both the pipelines will call syncMetadataTable() which will do the following:

      1. Find all un-synced instants from the dataset (T5, T6 ... T10).
      2. Read each instant and perform a deltacommit on the Metadata Table with the same timestamp as the instant.

      There is a chance that two processes perform a deltacommit at T5 on the Metadata Table and one of them fails (instant file already exists). The failing process raises an exception, which is detected as a pipeline failure and leads to false-positive alerts.
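The race described above can be modelled with a small sketch. This is not Hudi code: `MetadataTimeline`, `find_unsynced`, `apply_sync` and `InstantAlreadyExists` are hypothetical names, and instants are plain integers (T4 is `4`). It assumes both pipelines compute their un-synced list before either one writes, which is the interleaving that triggers the failure.

```python
class InstantAlreadyExists(Exception):
    """Stand-in for the 'instant file already exists' failure."""


class MetadataTimeline:
    """Toy model of the Metadata Table timeline (hypothetical, not a Hudi class)."""

    def __init__(self, synced_up_to):
        self.instants = set(range(1, synced_up_to + 1))

    def last_synced(self):
        return max(self.instants)

    def deltacommit(self, ts):
        # Creating an instant is exclusive: a second writer at the same
        # timestamp fails instead of overwriting.
        if ts in self.instants:
            raise InstantAlreadyExists(f"T{ts}.deltacommit already exists")
        self.instants.add(ts)


def find_unsynced(dataset_instants, metadata):
    """Step 1: instants on the dataset that the Metadata Table has not seen."""
    last = metadata.last_synced()
    return sorted(ts for ts in dataset_instants if ts > last)


def apply_sync(unsynced, metadata):
    """Step 2: one deltacommit per instant, reusing the instant's timestamp."""
    for ts in unsynced:
        metadata.deltacommit(ts)


dataset_instants = list(range(1, 11))        # T1..T10 on the dataset
metadata = MetadataTimeline(synced_up_to=4)  # Metadata Table is at T4

# Both pipelines plan their sync before either one writes.
plan_clustering = find_unsynced(dataset_instants, metadata)
plan_ingestion = find_unsynced(dataset_instants, metadata)

apply_sync(plan_clustering, metadata)        # first writer succeeds
try:
    apply_sync(plan_ingestion, metadata)     # second writer collides at T5
    ingestion_failed = False
except InstantAlreadyExists:
    ingestion_failed = True
```

In this model the ingestion pipeline's own commit is fine and the Metadata Table ends up fully synced, yet the pipeline still reports a failure, which is the false-positive alert described above.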

       

      Issue #2: No archiving/rollback support for failed clustering operations

      If a clustering operation fails, it leaves behind a T5.replacecommit.inflight. There is no automated way to roll back or archive these. Since clustering is generally a long-running operation and may be run by multiple pipelines at the same time, automated rollback of left-over inflights does not work, as we cannot be sure that the owning process is dead.

      Metadata Table sync only works in completion order. So if T5.replacecommit.inflight is left over, the Metadata Table will not sync past it. A large number of log blocks then pile up and have to be merged in memory, which steadily degrades performance.
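A minimal sketch of why one stale inflight blocks everything after it, assuming that "completion order" means the sync walks instants in timestamp order and stops at the first one that is not completed. The function `syncable_instants` and the dictionary-based timeline are illustrative only and do not correspond to Hudi APIs.

```python
from itertools import takewhile


def syncable_instants(timeline, last_synced):
    """Return the instants that can be synced now, in timestamp order.

    The walk stops at the first instant that is not completed, so nothing
    after a left-over inflight is eligible.
    """
    pending = sorted(
        (ts, state) for ts, state in timeline.items() if ts > last_synced
    )
    ready = takewhile(lambda item: item[1] == "completed", pending)
    return [ts for ts, _ in ready]


# T1..T10 exist on the dataset; T5 is a left-over replacecommit.inflight.
timeline = {ts: "completed" for ts in range(1, 11)}
timeline[5] = "inflight"

blocked = syncable_instants(timeline, last_synced=4)

# Completed instants that cannot be synced while T5 stays inflight.
backlog = sorted(
    ts for ts, state in timeline.items() if ts > 4 and state == "completed"
)
```

Here `blocked` is empty while `backlog` keeps growing with every new ingestion commit, which corresponds to the unmerged log blocks accumulating until T5 is completed or rolled back.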

       

          People

            pwason Prashant Wason