Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
Steps:
- Create a volume, bucket, and key
- Simulate an unhealthy replica in the container that holds the key
- Check whether the container closes
Expected behaviour - The container should be closed soon after the ICR (incremental container report) for the UNHEALTHY replica is received
Actual behaviour - The container is stuck in CLOSING state for more than 12 hours after the ICR was received
Container close initiated at -
2023-10-26 19:56:08,079 INFO [FixedThreadPoolWithAffinityExecutor-1-0]-org.apache.hadoop.hdds.scm.container.IncrementalContainerReportHandler: Moving OPEN container #18002 to CLOSING state, datanode f2a6be07-db06-430b-8311-534247744f99(quasar-yzwbdi-8.quasar-yzwbdi.root.hwx.site/172.27.112.2) reported UNHEALTHY replica with index 0.
Current state of container -
root@quasar-yzwbdi-1:~# ozone admin container info 18002
Container id: 18002
Pipeline id: 34771df9-8ba5-4a3e-9e48-abb590e67ea2
Container State: CLOSING
Datanodes: [f2a6be07-db06-430b-8311-534247744f99/quasar-yzwbdi-8.quasar-yzwbdi.root.hwx.site,
baa35af1-7b51-4275-b465-f750c429c618/quasar-yzwbdi-5.quasar-yzwbdi.root.hwx.site,
f40aed3a-dddf-4f2b-a30f-035136bfceba/quasar-yzwbdi-4.quasar-yzwbdi.root.hwx.site]
Replicas: [State: CLOSING; ReplicaIndex: 0; Origin: f40aed3a-dddf-4f2b-a30f-035136bfceba; Location: f40aed3a-dddf-4f2b-a30f-035136bfceba/quasar-yzwbdi-4.quasar-yzwbdi.root.hwx.site,
State: UNHEALTHY; ReplicaIndex: 0; Origin: f2a6be07-db06-430b-8311-534247744f99; Location: f2a6be07-db06-430b-8311-534247744f99/quasar-yzwbdi-8.quasar-yzwbdi.root.hwx.site,
State: CLOSING; ReplicaIndex: 0; Origin: baa35af1-7b51-4275-b465-f750c429c618; Location: baa35af1-7b51-4275-b465-f750c429c618/quasar-yzwbdi-5.quasar-yzwbdi.root.hwx.site]
root@quasar-yzwbdi-1:~# date
Fri 27 Oct 2023 04:57:29 AM UTC
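
When repeating this check, it may help to tally the replica states from the `ozone admin container info` text rather than reading them by eye. The following is only a rough sketch: the function name `summarize_container_info` and the `SAMPLE` excerpt (trimmed from the output above) are illustrative, and the regular expressions assume the plain-text layout shown in this report, which other Ozone versions may not follow.

```python
import re
from collections import Counter

# Trimmed excerpt of the container info output from this report (assumed layout).
SAMPLE = (
    "Container id: 18002 "
    "Container State: CLOSING "
    "Replicas: [State: CLOSING; ReplicaIndex: 0; "
    "Origin: f40aed3a-dddf-4f2b-a30f-035136bfceba, "
    "State: UNHEALTHY; ReplicaIndex: 0; "
    "Origin: f2a6be07-db06-430b-8311-534247744f99, "
    "State: CLOSING; ReplicaIndex: 0; "
    "Origin: baa35af1-7b51-4275-b465-f750c429c618]"
)


def summarize_container_info(text: str) -> dict:
    """Return the container state and a count of replica states."""
    # The container-level state is labelled "Container State:".
    container = re.search(r"Container State:\s*(\w+)", text)
    # Replica entries are written as "State: <STATE>;" (note the semicolon),
    # which the container-level line does not have.
    replicas = re.findall(r"State:\s*(\w+);", text)
    return {
        "container_state": container.group(1) if container else None,
        "replica_states": Counter(replicas),
    }


if __name__ == "__main__":
    print(summarize_container_info(SAMPLE))
```

For the output in this report, the summary shows the container in CLOSING with two CLOSING replicas and one UNHEALTHY replica, which matches the stuck state described.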