Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Version: 1.4.0
Description
Some HA integration tests intermittently fail to start OM/SCM service due to port conflicts:
OM
org.apache.ratis.util.ExitUtils$ExitException: Failed to start Grpc server
    at org.apache.ratis.util.ExitUtils.terminate(ExitUtils.java:141)
    at org.apache.ratis.util.ExitUtils.terminate(ExitUtils.java:151)
    at org.apache.ratis.grpc.server.GrpcService.startImpl(GrpcService.java:260)
    at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:270)
    at org.apache.ratis.server.RaftServerRpcWithProxy.start(RaftServerRpcWithProxy.java:72)
    at org.apache.ratis.server.impl.RaftServerProxy.startImpl(RaftServerProxy.java:394)
    at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:270)
    at org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:387)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.start(OzoneManagerRatisServer.java:557)
    at org.apache.hadoop.ozone.om.OzoneManager.start(OzoneManager.java:1513)
    at org.apache.hadoop.ozone.MiniOzoneHAClusterImpl$Builder.createOMService(MiniOzoneHAClusterImpl.java:525)
    at org.apache.hadoop.ozone.MiniOzoneHAClusterImpl$Builder.build(MiniOzoneHAClusterImpl.java:426)
SCM
org.apache.ratis.util.ExitUtils$ExitException: Failed to start Grpc server
    at org.apache.ratis.util.ExitUtils.terminate(ExitUtils.java:141)
    at org.apache.ratis.util.ExitUtils.terminate(ExitUtils.java:151)
    at org.apache.ratis.grpc.server.GrpcService.startImpl(GrpcService.java:260)
    at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:270)
    at org.apache.ratis.server.RaftServerRpcWithProxy.start(RaftServerRpcWithProxy.java:72)
    at org.apache.ratis.server.impl.RaftServerProxy.startImpl(RaftServerProxy.java:394)
    at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:270)
    at org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:387)
    at org.apache.hadoop.hdds.scm.ha.SCMRatisServerImpl.start(SCMRatisServerImpl.java:179)
    at org.apache.hadoop.hdds.scm.ha.SCMHAManagerImpl.start(SCMHAManagerImpl.java:102)
    at org.apache.hadoop.hdds.scm.server.StorageContainerManager.start(StorageContainerManager.java:1445)
    at org.apache.hadoop.ozone.MiniOzoneHAClusterImpl$Builder.createSCMService(MiniOzoneHAClusterImpl.java:604)
    at org.apache.hadoop.ozone.MiniOzoneHAClusterImpl$Builder.build(MiniOzoneHAClusterImpl.java:425)
- https://github.com/adoroszlai/ozone-build-results/blob/master/2022/12/16/19133/it-om/hadoop-ozone/integration-test/org.apache.hadoop.ozone.om.TestOMBucketLayoutUpgrade.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/01/05/19362/it-scm/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.TestStorageContainerManagerHA.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/01/10/19472/it-ozone/hadoop-ozone/integration-test/org.apache.hadoop.ozone.shell.TestOzoneShellHA.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/01/11/19484/it-ozone/hadoop-ozone/integration-test/org.apache.hadoop.ozone.TestMiniOzoneOMHACluster.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/01/12/19510/it-scm/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.TestFailoverWithSCMHA.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/02/01/19852/it-scm/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.TestStorageContainerManagerHA.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/02/02/19862/it-om/hadoop-ozone/integration-test/org.apache.hadoop.ozone.om.TestOMRatisSnapshots.txt
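For illustration only, and not necessarily the fix applied for this issue: port conflicts of this kind typically arise when a test picks a port number ahead of time and something else binds it before the server starts. A minimal sketch of the usual mitigation is to ask the OS for an unused port by binding to port 0. The class and method names below (FreePortSketch, findFreePort, isBindable) are hypothetical and do not come from Ozone or Ratis.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of Ozone or Ratis.
public class FreePortSketch {

    /** Asks the OS for a currently unused loopback port by binding to port 0. */
    public static int findFreePort() throws IOException {
        try (ServerSocket socket =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // The port is released when the socket closes, so another
            // process can still grab it before the caller binds it.
            return socket.getLocalPort();
        }
    }

    /** Returns true if the given loopback port can be bound right now. */
    public static boolean isBindable(int port) {
        try (ServerSocket socket =
                 new ServerSocket(port, 1, InetAddress.getLoopbackAddress())) {
            return true;
        } catch (IOException e) {
            // Typically a BindException: address already in use.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        int port = findFreePort();
        System.out.println("free port: " + port);

        // Hold a port to reproduce the conflict a second server would hit.
        try (ServerSocket holder =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            System.out.println("bindable while held: "
                + isBindable(holder.getLocalPort()));
        }
    }
}
```

Note that this approach narrows the window for a conflict but does not close it: the port returned by findFreePort is free only at the moment of the check, so a test harness that starts several services may still want to retry on bind failure.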
Issue Links
- relates to: HDDS-9881 Intermittent address already in use in TestSecureContainerServer (Resolved)