Details
- Type: Bug
- Status: Resolved
- Priority: Blocker
- Resolution: Fixed
Description
Currently, the scrubber runs as part of the create-pipeline path.
When SCM starts, the scrubber comes up and cleans up all the containers in SCM. This happens because, when pipelines are loaded, the pipelineCreationTimeStamp is set to the time the pipeline was originally created.
As a result, the condition below is satisfied and all the pipelines are destroyed when SCM is restarted. This is easy to reproduce: start SCM, wait for 10 minutes, and restart SCM.
List<Pipeline> needToSrubPipelines = stateManager.getPipelines(type, factor,
    Pipeline.PipelineState.ALLOCATED).stream()
    .filter(p -> currentTime.toEpochMilli() - p.getCreationTimestamp()
        .toEpochMilli() >= pipelineScrubTimeoutInMills)
    .collect(Collectors.toList());
for (Pipeline p : needToSrubPipelines) {
  LOG.info("srubbing pipeline: id: " + p.getId().toString() +
      " since it stays at ALLOCATED stage for " +
      Duration.between(currentTime, p.getCreationTimestamp()).toMinutes() +
      " mins.");
  finalizeAndDestroyPipeline(p, false);
}
Log showing scrubbing of pipeline
2020-02-20 12:42:18,946 [RatisPipelineUtilsThread] INFO org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager: srubbing pipeline: id: PipelineID=35dff62d-9bfa-449b-b6e8-6f00cc8c1b6e since it stays at ALLOCATED stage for -1003 mins.
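The negative duration in the log line above follows from the argument order of Duration.between(currentTime, creationTimestamp): java.time.Duration.between(start, end) is negative when end is earlier than start. The following is a minimal, standalone sketch of the same age check and the same duration computation. It is not the SCM code; the class name ScrubCheckSketch, its helper methods, the 5-minute timeout, and the UTC timestamp are hypothetical values chosen for illustration.

```java
import java.time.Duration;
import java.time.Instant;

public class ScrubCheckSketch {
  // Same comparison as the filter above: pipeline age vs. the scrub timeout.
  static boolean needsScrub(Instant creation, Instant now, long timeoutMillis) {
    return now.toEpochMilli() - creation.toEpochMilli() >= timeoutMillis;
  }

  // Argument order as in the log statement above: (currentTime, creationTimestamp).
  static long minutesAsLogged(Instant creation, Instant now) {
    return Duration.between(now, creation).toMinutes();
  }

  // Swapped order (creationTimestamp, currentTime) yields a positive age.
  static long minutesElapsed(Instant creation, Instant now) {
    return Duration.between(creation, now).toMinutes();
  }

  public static void main(String[] args) {
    // Illustrative values only: "now" is an assumed UTC instant, and the
    // pipeline is assumed to have been created 1003 minutes earlier.
    Instant now = Instant.parse("2020-02-20T12:42:18Z");
    Instant creation = now.minus(Duration.ofMinutes(1003));
    long timeoutMillis = Duration.ofMinutes(5).toMillis(); // hypothetical timeout

    System.out.println(needsScrub(creation, now, timeoutMillis)); // true
    System.out.println(minutesAsLogged(creation, now));           // -1003
    System.out.println(minutesElapsed(creation, now));            // 1003
  }
}
```

Because the creation timestamp reflects the original creation time rather than the restart time, any pipeline older than the timeout passes the age check immediately after a restart.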
Issue Links
- fixes: HDDS-3004 OM HA stability issues (Resolved)