  1. Apache Ozone
  2. HDDS-2823 SCM HA Support
  3. HDDS-4810

Integrate DeleteBlockLog with PartialTableCache


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 1.2.0
    • Fix Version/s: None
    • Component/s: SCM HA
    • Labels:
      None
    • Release Note:
      Resolved, since both the retry count and the delete transactions are handled correctly. Newly added transactions may not be picked up by a scan immediately, but that is not a problem.

      Description

      After SCM Snapshot is merged, writes to RocksDB are queued in DBTransactionBuffer and committed to RocksDB only when SCMDBTransactionBuffer#flush() is called in SCMStateMachine#takeSnapshot().

      If newly added, deleted, or updated DeletedBlocksTransaction entries are still queued in DBTransactionBuffer, two issues arise:

      • increaseRetryCountOfTransactionInDB() reads a stale retryCount from the table and applies its FAA (fetch-and-add) logic to that stale value.
      • SCMBlockDeletingService 1) cannot read newly added txns, and 2) still reads txns that have already been deleted.
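
      The first issue can be reproduced with a minimal sketch. BufferedRetryTable and all of its methods are hypothetical stand-ins for illustration, not Ozone classes: a HashMap plays the role of the committed RocksDB table and a second map plays the role of the transaction buffer.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a table whose writes are buffered until flush(). */
class BufferedRetryTable {
    // Committed state, standing in for the RocksDB table.
    private final Map<Long, Integer> table = new HashMap<>();
    // Queued writes, standing in for the transaction buffer.
    private final Map<Long, Integer> pending = new LinkedHashMap<>();

    /** Queues a write; it is not visible to get() until flush(). */
    void put(long txId, int retryCount) {
        pending.put(txId, retryCount);
    }

    /** Reads only the committed state, so queued writes are invisible. */
    Integer get(long txId) {
        return table.get(txId);
    }

    /** Commits all queued writes to the table. */
    void flush() {
        table.putAll(pending);
        pending.clear();
    }

    /** Fetch-and-add against the committed value only. */
    void increaseRetryCount(long txId) {
        Integer current = get(txId);
        if (current != null) {
            put(txId, current + 1);
        }
    }
}
```

      In this sketch, two calls to increaseRetryCount() between flushes both read the same committed value, so one increment is lost once the buffer is flushed.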

      We will enable PartialTableCache for DeletedBlocksTXTable. Every add, delete, and update will be applied to both PartialTableCache and DBTransactionBuffer. A read will check the cache first and fall back to the table on a cache miss. The cache will be evicted when SCMDBTransactionBuffer#flush() is called.
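
      The read and write paths described above might look roughly like the following sketch. CachedTxTable is a hypothetical class written for illustration and does not use the real PartialTableCache API. It assumes a delete is recorded in the cache as an empty Optional (a tombstone), so that a deleted txn is not served from the table before the next flush.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical cache-fronted table: txId -> retryCount. */
class CachedTxTable {
    // Committed state, standing in for the RocksDB table.
    private final Map<Long, Integer> table = new HashMap<>();
    // Queued writes; an empty Optional means "delete".
    private final Map<Long, Optional<Integer>> buffer = new LinkedHashMap<>();
    // Cache of uncommitted changes; an empty Optional is a tombstone.
    private final Map<Long, Optional<Integer>> cache = new HashMap<>();

    /** Add or update: applied to both the cache and the buffer. */
    void put(long txId, int retryCount) {
        cache.put(txId, Optional.of(retryCount));
        buffer.put(txId, Optional.of(retryCount));
    }

    /** Delete: recorded as a tombstone in both the cache and the buffer. */
    void delete(long txId) {
        cache.put(txId, Optional.empty());
        buffer.put(txId, Optional.empty());
    }

    /** Read: cache first, then the table on a cache miss. */
    Integer get(long txId) {
        Optional<Integer> cached = cache.get(txId);
        if (cached != null) {
            return cached.orElse(null); // null for a tombstone
        }
        return table.get(txId);
    }

    /** Commits the buffer to the table, then evicts the cache. */
    void flush() {
        buffer.forEach((txId, value) -> {
            if (value.isPresent()) {
                table.put(txId, value.get());
            } else {
                table.remove(txId);
            }
        });
        buffer.clear();
        cache.clear();
    }

    /** Fetch-and-add now sees the latest value through the cache. */
    void increaseRetryCount(long txId) {
        Integer current = get(txId);
        if (current != null) {
            put(txId, current + 1);
        }
    }
}
```

      The sketch is single-threaded and evicts the whole cache on flush; how the real cache tracks which entries are safe to evict is not specified here.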

      The cache and the table are modified by StateMachineUpdater, which applies committed operations sequentially, so updates cannot interleave and correctness follows directly.

      For the two issues above:

      • increaseRetryCountOfTransactionInDB() is executed in StateMachineUpdater, so it always reads the latest retryCount.
      • SCMBlockDeletingService scans the table via an iterator, so the iterator must first return the txns in the cache and then the txns that are in the table but not in the cache. Note that StateMachineUpdater may update the cache and the table while SCMBlockDeletingService is scanning them. This is not a problem: the scan is a periodic task, so SCMBlockDeletingService can safely ignore updates made during a scan, and a later scan will see them.
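
      The scan order in the second bullet can be sketched as follows. TxScan and scan() are hypothetical names, and the sketch assumes that a txn deleted but not yet flushed is held in the cache as an empty Optional (a tombstone). It is single-threaded and does not model concurrent updates from StateMachineUpdater.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical merged scan over a cache and its backing table. */
final class TxScan {
    private TxScan() { }

    /**
     * Returns txIds in scan order: live cache entries first, then table
     * entries whose key does not appear in the cache at all.
     */
    static List<Long> scan(Map<Long, Optional<String>> cache,
                           Map<Long, String> table) {
        List<Long> result = new ArrayList<>();
        // 1) Txns in the cache; tombstones (empty Optional) are skipped.
        for (Map.Entry<Long, Optional<String>> e : cache.entrySet()) {
            if (e.getValue().isPresent()) {
                result.add(e.getKey());
            }
        }
        // 2) Txns in the table that the cache neither overrides nor deletes.
        for (Long txId : table.keySet()) {
            if (!cache.containsKey(txId)) {
                result.add(txId);
            }
        }
        return result;
    }
}
```

      Skipping every table key that is present in the cache covers both updated txns, which were already returned in step 1, and deleted txns, which must not be returned at all.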

      For details, see https://docs.google.com/document/d/166Aea2EowSGWtAFWNlDv0gu4rA06dQ2rJAsBd-l210Q/edit?usp=sharing

            People

            • Assignee:
              Unassigned
              Reporter:
              glengeng Glen Geng