  1. Apache Ozone
  2. HDDS-2823 SCM HA Support
  3. HDDS-4125

Pipeline is not removed when a datanode goes stale


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SCM HA

      Description

      When a node goes stale, the pipelines on that node have to be closed and removed from PipelineManager. Currently, such a pipeline is only closed; it is never removed and remains in PipelineManager.

       

      Root Cause Analysis 

      The Scheduler in SCMPipelineManager that used to call destroyPipeline has been removed:

      scheduler.schedule(() -> destroyPipeline(pipeline),
          pipelineDestroyTimeoutInMillis, TimeUnit.MILLISECONDS, LOG,
          String.format("Destroy pipeline failed for pipeline:%s", pipeline));

      Meanwhile, PipelineManagerV2Impl::scrubPipeline handles and removes only RATIS THREE pipelines:

      public void scrubPipeline(ReplicationType type, ReplicationFactor factor)
          throws IOException {
        checkLeader();
        if (type != ReplicationType.RATIS || factor != ReplicationFactor.THREE) {
          // Only scrub RATIS THREE pipelines
          return;
        }
      

      As a result, a RATIS ONE pipeline is closed but not removed when a datanode goes stale. The solution is to let scrubPipeline handle all kinds of pipelines.
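
      The proposed direction can be illustrated with a minimal, self-contained sketch. It is not the actual Ozone code or the committed fix: the Manager, Pipeline and State types, the scrub method, and the idea of reusing a destroy timeout are all assumptions made for illustration. The only point it shows is that the scrub pass no longer returns early for non-RATIS-THREE pipelines, so a closed RATIS ONE pipeline is eventually removed as well.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model. None of these types are the real Ozone classes.
final class PipelineScrubSketch {
  enum ReplicationType { RATIS, STAND_ALONE }
  enum ReplicationFactor { ONE, THREE }
  enum State { OPEN, CLOSED }

  static final class Pipeline {
    final String id;
    final ReplicationType type;
    final ReplicationFactor factor;
    State state = State.OPEN;
    long closedAtMillis = -1;

    Pipeline(String id, ReplicationType type, ReplicationFactor factor) {
      this.id = id;
      this.type = type;
      this.factor = factor;
    }

    // Called when a datanode of this pipeline goes stale (assumed behaviour).
    void close(long nowMillis) {
      state = State.CLOSED;
      closedAtMillis = nowMillis;
    }
  }

  static final class Manager {
    private final Map<String, Pipeline> pipelines = new LinkedHashMap<>();
    private final long destroyTimeoutMillis;

    Manager(long destroyTimeoutMillis) {
      this.destroyTimeoutMillis = destroyTimeoutMillis;
    }

    void add(Pipeline p) {
      pipelines.put(p.id, p);
    }

    boolean contains(String id) {
      return pipelines.containsKey(id);
    }

    // Removes every CLOSED pipeline whose timeout has elapsed.
    // There is deliberately no early return based on type or factor.
    List<String> scrub(long nowMillis) {
      List<String> removed = new ArrayList<>();
      Iterator<Pipeline> it = pipelines.values().iterator();
      while (it.hasNext()) {
        Pipeline p = it.next();
        boolean expired = nowMillis - p.closedAtMillis >= destroyTimeoutMillis;
        if (p.state == State.CLOSED && expired) {
          it.remove();
          removed.add(p.id);
        }
      }
      return removed;
    }
  }

  public static void main(String[] args) {
    Manager manager = new Manager(1000);
    Pipeline one = new Pipeline("ratis-one", ReplicationType.RATIS, ReplicationFactor.ONE);
    Pipeline three = new Pipeline("ratis-three", ReplicationType.RATIS, ReplicationFactor.THREE);
    manager.add(one);
    manager.add(three);

    one.close(0); // the datanode behind the RATIS ONE pipeline goes stale

    System.out.println(manager.scrub(500));  // prints [] (timeout not yet elapsed)
    System.out.println(manager.scrub(1000)); // prints [ratis-one]
  }
}
```

      In this sketch the open RATIS THREE pipeline is left untouched, while the closed RATIS ONE pipeline is removed once the assumed timeout has passed.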

              People

              • Assignee: Glen Geng (glengeng)
              • Reporter: Nanda kumar (nanda)