Apache Ozone / HDDS-728

Datanodes should use a different ContainerStateMachine for each pipeline.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.3.0
    • Fix Version/s: 0.3.0, 0.4.0
    • Component/s: Ozone Filesystem
    • Labels: None

    Description

      Set up a 5-datanode Ozone cluster with HDP on top of it.

      After restarting all HDP services a few times, I encountered the issue below, which causes the HDP services to fail.

      The same exception was observed in an older setup. I assumed it was a problem with that setup, but the same issue has now appeared in a new setup as well.

      2018-10-24 10:42:03,308 WARN org.apache.ratis.grpc.server.GrpcServerProtocolService: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 1672d28e-800f-4318-895b-1648976acff6->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
      org.apache.ratis.protocol.GroupMismatchException: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
      at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
      at org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
      at org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
      at org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
      at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
      at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
      at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      2018-10-24 10:42:03,342 WARN org.apache.ratis.grpc.server.GrpcServerProtocolService: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 7839294e-5657-447f-b320-6b390fffb963->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
      org.apache.ratis.protocol.GroupMismatchException: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
      at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
      at org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
      at org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
      at org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
      at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
      at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
      at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      2018-10-24 10:42:04,466 WARN org.apache.ratis.grpc.server.GrpcServerProtocolService: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 1672d28e-800f-4318-895b-1648976acff6->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
      org.apache.ratis.protocol.GroupMismatchException: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
      at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
      at org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
      at org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
      at org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
      at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
      at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
      at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
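
The issue title asks for a separate ContainerStateMachine per pipeline, and the log above shows a server rejecting a requestVote for a group it does not know. As a rough illustration only, the sketch below keys one state machine instance per group ID and fails the lookup for an unregistered group. Every name in it (PipelineStateMachines, SketchStateMachine, addGroup, get) is hypothetical and is not Ratis or Ozone API; the patches attached to this issue may take a different approach.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: one state machine instance per pipeline (Raft group). */
public class PipelineStateMachines {

  /** Stand-in for a per-pipeline state machine; not the real ContainerStateMachine. */
  static final class SketchStateMachine {
    private final UUID groupId;
    private long appliedEntries;

    SketchStateMachine(UUID groupId) {
      this.groupId = groupId;
    }

    UUID getGroupId() {
      return groupId;
    }

    /** Counts an applied entry; each instance keeps its own counter. */
    synchronized long apply() {
      return ++appliedEntries;
    }
  }

  private final Map<UUID, SketchStateMachine> byGroup = new ConcurrentHashMap<>();

  /** Registers a group, creating a fresh state machine unless one already exists. */
  SketchStateMachine addGroup(UUID groupId) {
    return byGroup.computeIfAbsent(groupId, SketchStateMachine::new);
  }

  /** Returns the state machine for a group, or fails if the group was never registered. */
  SketchStateMachine get(UUID groupId) {
    SketchStateMachine sm = byGroup.get(groupId);
    if (sm == null) {
      throw new IllegalStateException("group " + groupId + " not found");
    }
    return sm;
  }

  public static void main(String[] args) {
    PipelineStateMachines registry = new PipelineStateMachines();
    UUID pipelineA = UUID.randomUUID();
    UUID pipelineB = UUID.randomUUID();

    registry.addGroup(pipelineA).apply();
    registry.addGroup(pipelineB);

    // The two pipelines do not share an instance or its state.
    System.out.println(registry.get(pipelineA) != registry.get(pipelineB));

    try {
      registry.get(UUID.randomUUID());
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Keeping the instances in a map keyed by group ID means state applied for one pipeline is never visible through another, and a request for an unknown group fails at the lookup instead of reaching a shared instance.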
      

      Attachments

        1. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000002.hwx.site.log
          468 kB
          Soumitra Sulav
        2. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000003.hwx.site.log
          1.11 MB
          Soumitra Sulav
        3. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000008.hwx.site.log
          1.97 MB
          Soumitra Sulav
        4. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000009.hwx.site.log
          2.22 MB
          Soumitra Sulav
        5. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000010.hwx.site.log
          2.31 MB
          Soumitra Sulav
        6. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000004.hwx.site.log
          1.94 MB
          Soumitra Sulav
        7. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000005.hwx.site.log
          3.73 MB
          Soumitra Sulav
        8. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000006.hwx.site.log
          14.42 MB
          Soumitra Sulav
        9. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000007.hwx.site.log
          3.02 MB
          Soumitra Sulav
        10. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000008.hwx.site.log
          16.68 MB
          Soumitra Sulav
        11. hadoop-root-om-ctr-e138-1518143905142-541600-02-000002.hwx.site.log
          31 kB
          Soumitra Sulav
        12. hadoop-root-scm-ctr-e138-1518143905142-541600-02-000002.hwx.site.log
          2.00 MB
          Soumitra Sulav
        13. HDDS-728.001.patch
          20 kB
          Mukul Kumar Singh
        14. HDDS-728.002.patch
          20 kB
          Mukul Kumar Singh
        15. HDDS-728.003.patch
          21 kB
          Mukul Kumar Singh
        16. HDDS-728.004.patch
          22 kB
          Mukul Kumar Singh
        17. HDDS-728.005.patch
          26 kB
          Mukul Kumar Singh
        18. HDDS-728.006.patch
          26 kB
          Mukul Kumar Singh
        19. HDDS-728.007.patch
          24 kB
          Mukul Kumar Singh
        20. HDDS-728.008.patch
          24 kB
          Mukul Kumar Singh
        21. HDDS-728.009.patch
          25 kB
          Mukul Kumar Singh
        22. HDDS-728.010.patch
          25 kB
          Mukul Kumar Singh
        23. HDDS-728.011.patch
          25 kB
          Mukul Kumar Singh
        24. HDDS-728.012.patch
          28 kB
          Mukul Kumar Singh
        25. HDDS-728-ozone-0.3.005.patch
          30 kB
          Mukul Kumar Singh
        26. om-audit-ctr-e138-1518143905142-541600-02-000002.hwx.site.log
          76 kB
          Soumitra Sulav

          People

            Assignee:
            msingh Mukul Kumar Singh
            Reporter:
            ssulav Soumitra Sulav
            Votes:
            0
            Watchers:
            9

            Dates

              Created:
              Updated:
              Resolved: