  1. Apache Ozone
  2. HDDS-3067

Fix bug in pipeline scrubber that destroys pipelines after SCM restart


Details

    Description

      Currently, the scrubber runs as part of pipeline creation.

      When SCM starts, the scrubber comes up and cleans up all the pipelines in SCM. This happens because, when pipelines are loaded, the pipelineCreationTimeStamp is set to the time the pipeline was originally created, not the time it was loaded.

      As a result, the condition below is satisfied and all pipelines are destroyed when SCM is restarted. This is easy to reproduce: start SCM, wait 10 minutes, then restart SCM.

       

      List<Pipeline> needToSrubPipelines = stateManager.getPipelines(type, factor,
          Pipeline.PipelineState.ALLOCATED).stream()
          .filter(p -> currentTime.toEpochMilli() - p.getCreationTimestamp()
              .toEpochMilli() >= pipelineScrubTimeoutInMills)
          .collect(Collectors.toList());
      for (Pipeline p : needToSrubPipelines) {
        LOG.info("srubbing pipeline: id: " + p.getId().toString() +
            " since it stays at ALLOCATED stage for " +
            Duration.between(currentTime, p.getCreationTimestamp()).toMinutes() +
            " mins.");
        finalizeAndDestroyPipeline(p, false);
      }
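
      To make the failure mode concrete, the filter quoted above can be reduced to a small standalone sketch. The class name, the helper method, and the 5-minute timeout are illustrative assumptions and are not taken from the Ozone code base; only the comparison itself mirrors the snippet. It shows that a pipeline whose creation timestamp predates the restart by more than the timeout is selected for scrubbing immediately, while a timestamp taken at load time would not be.

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative sketch only; names here are hypothetical, not Ozone APIs.
final class ScrubConditionSketch {

    // Same comparison as the filter above: a pipeline qualifies for scrubbing
    // once (now - creation timestamp) reaches the scrub timeout.
    public static boolean shouldScrub(Instant creation, Instant now, Duration timeout) {
        return now.toEpochMilli() - creation.toEpochMilli() >= timeout.toMillis();
    }

    public static void main(String[] args) {
        Duration timeout = Duration.ofMinutes(5);            // assumed timeout value
        Instant restart = Instant.parse("2020-02-20T12:42:18Z");

        // Pipeline created 10 minutes before the restart; timestamp kept as-is on load.
        Instant originalCreation = restart.minus(Duration.ofMinutes(10));
        System.out.println(shouldScrub(originalCreation, restart, timeout)); // true

        // Hypothetical alternative: the timestamp is taken at load time instead.
        System.out.println(shouldScrub(restart, restart, timeout));          // false
    }
}
```

      The second call is only meant to show why the age of the timestamp matters; it is not a statement about how the issue was actually resolved.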

       

      Log showing the pipeline being scrubbed:

       

      2020-02-20 12:42:18,946 [RatisPipelineUtilsThread] INFO org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager: srubbing pipeline: id: PipelineID=35dff62d-9bfa-449b-b6e8-6f00cc8c1b6e since it stays at ALLOCATED stage for -1003 mins.
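
      The negative value in the log line ("-1003 mins") is consistent with the argument order in the logging code: Duration.between(start, end) returns end minus start, so passing currentTime first and the earlier creation timestamp second gives a negative duration. The sketch below reproduces this with hypothetical timestamps (a pipeline assumed to be created 1003 minutes before the scrub); the class and method names are illustrative only.

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative sketch only; names and timestamps are hypothetical.
final class ScrubLogDurationSketch {

    // Argument order used in the log statement: (currentTime, creation).
    public static long loggedMinutes(Instant currentTime, Instant creation) {
        return Duration.between(currentTime, creation).toMinutes();
    }

    // Swapped order: (creation, currentTime) gives the pipeline's age.
    public static long ageMinutes(Instant currentTime, Instant creation) {
        return Duration.between(creation, currentTime).toMinutes();
    }

    public static void main(String[] args) {
        Instant currentTime = Instant.parse("2020-02-20T12:42:18Z");
        Instant creation = currentTime.minus(Duration.ofMinutes(1003)); // assumed age

        System.out.println(loggedMinutes(currentTime, creation)); // -1003
        System.out.println(ageMinutes(currentTime, creation));    // 1003
    }
}
```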


            People

              bharat Bharat Viswanadham

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m