Apache Ozone / HDDS-5626 Track and Address Flaky tests / HDDS-10750

Intermittent fork timeout while stopping Ratis server

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed

    Description

      [INFO] Running org.apache.hadoop.ozone.client.rpc.TestECKeyOutputStreamWithZeroCopy
      [INFO] 
      [INFO] Results:
      ...
      ... There was a timeout or other error in the fork
      
      "main" 
         java.lang.Thread.State: WAITING
              at java.lang.Object.wait(Native Method)
              at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:405)
              ...
              at org.apache.hadoop.ozone.MiniOzoneClusterImpl.stopDatanodes(MiniOzoneClusterImpl.java:473)
              at org.apache.hadoop.ozone.MiniOzoneClusterImpl.stop(MiniOzoneClusterImpl.java:414)
              at org.apache.hadoop.ozone.MiniOzoneClusterImpl.shutdown(MiniOzoneClusterImpl.java:400)
              at org.apache.hadoop.ozone.client.rpc.AbstractTestECKeyOutputStream.shutdown(AbstractTestECKeyOutputStream.java:160)
      
      "ForkJoinPool.commonPool-worker-7" 
         java.lang.Thread.State: TIMED_WAITING
              ...
              at java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1475)
              at org.apache.ratis.util.ConcurrentUtils.shutdownAndWait(ConcurrentUtils.java:144)
              at org.apache.ratis.util.ConcurrentUtils.shutdownAndWait(ConcurrentUtils.java:136)
              at org.apache.ratis.server.impl.RaftServerProxy.lambda$close$9(RaftServerProxy.java:438)
              ...
              at org.apache.ratis.util.LifeCycle.checkStateAndClose(LifeCycle.java:304)
              at org.apache.ratis.server.impl.RaftServerProxy.close(RaftServerProxy.java:415)
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.stop(XceiverServerRatis.java:603)
              at org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.stop(OzoneContainer.java:484)
              at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.close(DatanodeStateMachine.java:447)
              at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.stopDaemon(DatanodeStateMachine.java:637)
              at org.apache.hadoop.ozone.HddsDatanodeService.stop(HddsDatanodeService.java:550)
              at org.apache.hadoop.ozone.MiniOzoneClusterImpl.stopDatanode(MiniOzoneClusterImpl.java:479)
              at org.apache.hadoop.ozone.MiniOzoneClusterImpl$$Lambda$2077/645273703.accept(Unknown Source)
      
      "c7edee5d-bf3c-45a7-a783-e11562f208dc-impl-thread2" 
         java.lang.Thread.State: WAITING
              ...
              at java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1947)
              at org.apache.ratis.server.impl.RaftServerImpl.lambda$close$3(RaftServerImpl.java:543)
              at org.apache.ratis.server.impl.RaftServerImpl$$Lambda$1925/263251010.run(Unknown Source)
              at org.apache.ratis.util.LifeCycle.lambda$checkStateAndClose$7(LifeCycle.java:306)
              at org.apache.ratis.util.LifeCycle$$Lambda$1204/655954062.get(Unknown Source)
              at org.apache.ratis.util.LifeCycle.checkStateAndClose(LifeCycle.java:326)
              at org.apache.ratis.util.LifeCycle.checkStateAndClose(LifeCycle.java:304)
              at org.apache.ratis.server.impl.RaftServerImpl.close(RaftServerImpl.java:525)
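
      The traces above show three waits stacked on each other: "main" is blocked in ForkJoinTask.doInvoke under stopDatanodes, the common-pool worker is in a timed ThreadPoolExecutor.awaitTermination under ConcurrentUtils.shutdownAndWait, and the impl thread is blocked in CompletableFuture.join inside RaftServerImpl.close. As an illustration only, the sketch below reproduces that general shape with plain JDK classes: an executor cannot terminate while one of its tasks is joined on a future that nobody completes. It is not Ratis or Ozone code and does not represent the fix applied for this issue; the class name ShutdownHangSketch, the methods shutdownAndWait and demo, and the 200 ms timeout are all hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; none of these names come from Ratis or Ozone.
public class ShutdownHangSketch {

  // Requests shutdown and waits up to timeoutMs for running tasks to finish.
  // Returns true only if the executor actually terminated within the timeout.
  static boolean shutdownAndWait(ExecutorService executor, long timeoutMs) {
    executor.shutdown();
    try {
      return executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
  }

  // Runs one task that joins on a future, then shuts the executor down.
  // If completeFuture is false, the task stays blocked in join() and the
  // bounded wait expires without the executor terminating.
  static boolean demo(boolean completeFuture) {
    CompletableFuture<Void> pending = new CompletableFuture<>();
    ExecutorService implExecutor = Executors.newSingleThreadExecutor();
    implExecutor.execute(() -> { pending.join(); });

    if (completeFuture) {
      pending.complete(null);
    }
    boolean terminated = shutdownAndWait(implExecutor, 200);

    if (!terminated) {
      // Clean-up for the sketch only: release the blocked task so the
      // worker thread can exit and the JVM is not kept alive.
      pending.complete(null);
      shutdownAndWait(implExecutor, 1000);
    }
    return terminated;
  }

  public static void main(String[] args) {
    System.out.println("future completed: terminated=" + demo(true));
    System.out.println("future pending:   terminated=" + demo(false));
  }
}
```

      In the sketch the wait is bounded, so the caller gets a false result instead of hanging. A caller that retries the timed wait in a loop, or that is itself joined by another thread without a timeout, would stay blocked until the surrounding test fork is killed.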
      

            People

              Assignee: Unassigned
              Reporter: Attila Doroszlai (adoroszlai)
              Votes: 0
              Watchers: 6