  1. Apache Ozone
  2. HDDS-2823 SCM HA Support
  3. HDDS-4125

Pipeline is not removed when a datanode goes stale


Details

    • Sub-task
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • None
    • SCM HA

    Description

      When a node goes stale, the pipelines on that node have to be closed and removed from PipelineManager. Currently, such a pipeline is only closed; it is never removed from PipelineManager.

       

      Root Cause Analysis 

      The scheduler in SCMPipelineManager that used to call destroyPipeline has been removed:

      scheduler.schedule(() -> destroyPipeline(pipeline),
          pipelineDestroyTimeoutInMillis, TimeUnit.MILLISECONDS, LOG,
          String.format("Destroy pipeline failed for pipeline:%s", pipeline));

      Meanwhile, PipelineManagerV2Impl::scrubPipeline handles and removes only RATIS THREE pipelines:

      public void scrubPipeline(ReplicationType type, ReplicationFactor factor)
          throws IOException {
        checkLeader();
        if (type != ReplicationType.RATIS || factor != ReplicationFactor.THREE) {
          // Only scrub RATIS THREE pipelines
          return;
        }
      

      As a result, a RATIS ONE pipeline is closed but never removed when a datanode goes stale. The solution is to let scrubPipeline handle all kinds of pipelines.
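
      The idea of a scrub that does not filter on replication type or factor can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual patch: the class PipelineScrubSketch, its nested types, and the rule "remove a CLOSED pipeline once a destroy timeout has elapsed" are all hypothetical stand-ins and do not reflect the real PipelineManagerV2Impl API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model only: every type here is a hypothetical stand-in,
// not one of the real Ozone classes.
class PipelineScrubSketch {
  enum ReplicationType { RATIS, STAND_ALONE }
  enum ReplicationFactor { ONE, THREE }
  enum State { OPEN, CLOSED }

  static final class Pipeline {
    final String id;
    final ReplicationType type;
    final ReplicationFactor factor;
    State state = State.OPEN;
    long closedAtMillis = -1L;

    Pipeline(String id, ReplicationType type, ReplicationFactor factor) {
      this.id = id;
      this.type = type;
      this.factor = factor;
    }

    // Mark the pipeline closed, e.g. because one of its datanodes went stale.
    void close(long nowMillis) {
      state = State.CLOSED;
      closedAtMillis = nowMillis;
    }
  }

  private final List<Pipeline> pipelines = new ArrayList<>();

  void add(Pipeline pipeline) {
    pipelines.add(pipeline);
  }

  int size() {
    return pipelines.size();
  }

  // Remove every CLOSED pipeline whose destroy timeout has elapsed.
  // There is deliberately no early return on type or factor, so a
  // RATIS ONE pipeline is scrubbed just like a RATIS THREE pipeline.
  int scrub(long nowMillis, long destroyTimeoutMillis) {
    int removed = 0;
    Iterator<Pipeline> it = pipelines.iterator();
    while (it.hasNext()) {
      Pipeline p = it.next();
      boolean expired = nowMillis - p.closedAtMillis >= destroyTimeoutMillis;
      if (p.state == State.CLOSED && expired) {
        it.remove();
        removed++;
      }
    }
    return removed;
  }
}
```

      In this model, closing a RATIS ONE and a RATIS THREE pipeline and then calling scrub after the timeout removes both, while pipelines that are still open stay registered.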

            People

              glengeng Glen Geng
              nanda Nandakumar