  1. Hadoop Distributed Data Store
  2. HDDS-400

Check global replication state for containers of dead node

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.2.1
    • Component/s: SCM
    • Labels:
      None

      Description

      The current container replication handler compares the reported containers with the previous report. It handles over- and under-replicated states.

      But there is no logic to check the cluster-wide replication count. If a node goes down, the resulting loss of replicas won't be detected.

      For the sake of simplicity I would add this check to the ContainerReportHandler (as of now), so that every reported container is verified to have enough replicas.

      We can check the performance implications with Genesis, but as a first implementation I think it could be good enough.

      ----- After Jira discussion below, the final patch does the following: -----
      When a dead node is reported, the DeadNodeHandler checks the replication state of all the containers on that node and fires replication events in case of under/over-replicated blocks.
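
      The behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the code from the attached patches: the class DeadNodeReplicationSketch, the ReplicationRequest type, the in-memory maps and the method names are all hypothetical, and the real SCM classes are likely structured differently. It only shows the idea of walking the containers of a dead node and emitting an event whenever the remaining replica count differs from the expected replication factor.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a dead-node replication check; not the real SCM code. */
final class DeadNodeReplicationSketch {

  /** Hypothetical event: a container whose replica count differs from the expected factor. */
  static final class ReplicationRequest {
    final long containerId;
    final int expected;
    final int actual;

    ReplicationRequest(long containerId, int expected, int actual) {
      this.containerId = containerId;
      this.expected = expected;
      this.actual = actual;
    }
  }

  // containerId -> datanodes that currently hold a replica (assumed in-memory state)
  private final Map<Long, Set<String>> replicas = new HashMap<>();
  // datanode -> containers stored on it (assumed node-to-container map)
  private final Map<String, Set<Long>> nodeToContainers = new HashMap<>();
  private final int replicationFactor;

  DeadNodeReplicationSketch(int replicationFactor) {
    this.replicationFactor = replicationFactor;
  }

  /** Records that the given datanode holds a replica of the container. */
  void addReplica(long containerId, String datanode) {
    replicas.computeIfAbsent(containerId, k -> new HashSet<>()).add(datanode);
    nodeToContainers.computeIfAbsent(datanode, k -> new HashSet<>()).add(containerId);
  }

  /**
   * Called when a datanode is declared dead. Drops its replicas and returns one
   * request for every affected container that is now under- or over-replicated.
   */
  List<ReplicationRequest> onDeadNode(String datanode) {
    List<ReplicationRequest> events = new ArrayList<>();
    Set<Long> containers = nodeToContainers.remove(datanode);
    if (containers == null) {
      return events; // unknown node or node without containers
    }
    for (long containerId : containers) {
      Set<String> holders = replicas.get(containerId);
      holders.remove(datanode);
      int actual = holders.size();
      if (actual != replicationFactor) {
        events.add(new ReplicationRequest(containerId, replicationFactor, actual));
      }
    }
    return events;
  }
}
```

      In this sketch only the containers of the dead node are inspected, so the cost is proportional to the number of containers on that node rather than to the size of every container report.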

        Attachments

        1. HDDS-400.001.patch
          16 kB
          Marton Elek
        2. HDDS-400.002.patch
          16 kB
          Marton Elek
        3. HDDS-400.004.patch
          15 kB
          Marton Elek
        4. HDDS-400.005.patch
          27 kB
          Marton Elek

            People

            • Assignee: Marton Elek (elek)
            • Reporter: Marton Elek (elek)
            • Votes: 0
            • Watchers: 7
