Apache Ozone: HDDS-728

Datanodes should use different ContainerStateMachine for each pipeline.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.3.0
    • Fix Version/s: 0.3.0, 0.4.0
    • Component/s: Ozone Filesystem
    • Labels: None

    Description

      Set up a 5-datanode Ozone cluster with HDP on top of it.

      After restarting all HDP services a few times, the issue below was encountered; it causes the HDP services to fail.

      The same exception was observed in an older setup, where I assumed it was a problem with that setup, but it has now occurred in a new setup as well.

      2018-10-24 10:42:03,308 WARN org.apache.ratis.grpc.server.GrpcServerProtocolService: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 1672d28e-800f-4318-895b-1648976acff6->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
      org.apache.ratis.protocol.GroupMismatchException: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
      at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
      at org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
      at org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
      at org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
      at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
      at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
      at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      2018-10-24 10:42:03,342 WARN org.apache.ratis.grpc.server.GrpcServerProtocolService: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 7839294e-5657-447f-b320-6b390fffb963->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
      org.apache.ratis.protocol.GroupMismatchException: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
      at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
      at org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
      at org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
      at org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
      at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
      at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
      at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      2018-10-24 10:42:04,466 WARN org.apache.ratis.grpc.server.GrpcServerProtocolService: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 1672d28e-800f-4318-895b-1648976acff6->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
      org.apache.ratis.protocol.GroupMismatchException: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
      at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
      at org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
      at org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
      at org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
      at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
      at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
      at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      

      Attachments

        1. HDDS-728-ozone-0.3.005.patch
          30 kB
          Mukul Kumar Singh
        2. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000006.hwx.site.log
          14.42 MB
          Soumitra Sulav
        3. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000008.hwx.site.log
          16.68 MB
          Soumitra Sulav
        4. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000007.hwx.site.log
          3.02 MB
          Soumitra Sulav
        5. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000005.hwx.site.log
          3.73 MB
          Soumitra Sulav
        6. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000008.hwx.site.log
          1.97 MB
          Soumitra Sulav
        7. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000004.hwx.site.log
          1.94 MB
          Soumitra Sulav
        8. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000010.hwx.site.log
          2.31 MB
          Soumitra Sulav
        9. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000009.hwx.site.log
          2.22 MB
          Soumitra Sulav
        10. HDDS-728.012.patch
          28 kB
          Mukul Kumar Singh
        11. HDDS-728.011.patch
          25 kB
          Mukul Kumar Singh
        12. HDDS-728.010.patch
          25 kB
          Mukul Kumar Singh
        13. HDDS-728.009.patch
          25 kB
          Mukul Kumar Singh
        14. HDDS-728.008.patch
          24 kB
          Mukul Kumar Singh
        15. HDDS-728.007.patch
          24 kB
          Mukul Kumar Singh
        16. HDDS-728.006.patch
          26 kB
          Mukul Kumar Singh
        17. HDDS-728.005.patch
          26 kB
          Mukul Kumar Singh
        18. HDDS-728.004.patch
          22 kB
          Mukul Kumar Singh
        19. HDDS-728.003.patch
          21 kB
          Mukul Kumar Singh
        20. HDDS-728.002.patch
          20 kB
          Mukul Kumar Singh
        21. HDDS-728.001.patch
          20 kB
          Mukul Kumar Singh
        22. hadoop-root-scm-ctr-e138-1518143905142-541600-02-000002.hwx.site.log
          2.00 MB
          Soumitra Sulav
        23. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000002.hwx.site.log
          468 kB
          Soumitra Sulav
        24. om-audit-ctr-e138-1518143905142-541600-02-000002.hwx.site.log
          76 kB
          Soumitra Sulav
        25. hadoop-root-om-ctr-e138-1518143905142-541600-02-000002.hwx.site.log
          31 kB
          Soumitra Sulav
        26. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000003.hwx.site.log
          1.11 MB
          Soumitra Sulav

          People

            Assignee: msingh Mukul Kumar Singh
            Reporter: ssulav Soumitra Sulav