  1. Apache Ozone
  2. HDDS-728

Datanodes should use different ContainerStateMachine for each pipeline.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.3.0
    • Fix Version/s: 0.3.0, 0.4.0
    • Component/s: Ozone Filesystem
    • Labels: None

    Description

      Set up a 5-datanode Ozone cluster with HDP on top of it.

      After restarting all HDP services a few times, I encountered the issue below, which causes the HDP services to fail.

      The same exception was observed on an older setup. I assumed it was a problem with that setup, but the same issue has now occurred on a new setup as well.

      2018-10-24 10:42:03,308 WARN org.apache.ratis.grpc.server.GrpcServerProtocolService: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 1672d28e-800f-4318-895b-1648976acff6->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
      org.apache.ratis.protocol.GroupMismatchException: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
      at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
      at org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
      at org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
      at org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
      at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
      at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
      at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      2018-10-24 10:42:03,342 WARN org.apache.ratis.grpc.server.GrpcServerProtocolService: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 7839294e-5657-447f-b320-6b390fffb963->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
      org.apache.ratis.protocol.GroupMismatchException: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
      at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
      at org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
      at org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
      at org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
      at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
      at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
      at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      2018-10-24 10:42:04,466 WARN org.apache.ratis.grpc.server.GrpcServerProtocolService: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote 1672d28e-800f-4318-895b-1648976acff6->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0
      org.apache.ratis.protocol.GroupMismatchException: 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found.
      at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
      at org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
      at org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256)
      at org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411)
      at org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
      at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
      at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
      at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
      at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
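
The issue title asks for a separate state machine per pipeline on each datanode. As a rough illustration only, and not the actual patch, the sketch below keeps one state machine instance per group ID and rejects lookups for groups the server does not know about, similar in spirit to the "group-... not found" failure in the log above. `PipelineStateMachine` and `Registry` are hypothetical names invented for this example; they are not Ozone or Ratis classes, and the real `ContainerStateMachine` is assumed to carry far more state.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class PerPipelineStateMachines {

    /** Hypothetical stand-in for a per-pipeline state machine (not the Ozone class). */
    static final class PipelineStateMachine {
        private final UUID groupId;
        private long appliedIndex = 0;

        PipelineStateMachine(UUID groupId) {
            this.groupId = groupId;
        }

        UUID groupId() {
            return groupId;
        }

        /** Applies one entry and returns the new applied index for this pipeline only. */
        synchronized long apply() {
            return ++appliedIndex;
        }
    }

    /** Hypothetical registry: one state machine per group, never shared across groups. */
    static final class Registry {
        private final Map<UUID, PipelineStateMachine> machines = new ConcurrentHashMap<>();

        /** Creates the state machine for a group on first use and reuses it afterwards. */
        PipelineStateMachine addGroup(UUID groupId) {
            return machines.computeIfAbsent(groupId, PipelineStateMachine::new);
        }

        /** Looks up a group's state machine; unknown groups are rejected. */
        PipelineStateMachine get(UUID groupId) {
            PipelineStateMachine sm = machines.get(groupId);
            if (sm == null) {
                throw new IllegalStateException("group " + groupId + " not found");
            }
            return sm;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        UUID pipelineA = UUID.randomUUID();
        UUID pipelineB = UUID.randomUUID();

        PipelineStateMachine a = registry.addGroup(pipelineA);
        PipelineStateMachine b = registry.addGroup(pipelineB);

        // Each pipeline advances its own applied index independently.
        a.apply();
        a.apply();
        b.apply();
        System.out.println("distinct instances: " + (a != b));

        // A request for a group that was never added fails the lookup.
        try {
            registry.get(UUID.randomUUID());
        } catch (IllegalStateException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

Keying the map by group ID means that state such as the applied index cannot leak between pipelines hosted on the same datanode, which is the property the issue title asks for.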
      

      Attachments

        1. om-audit-ctr-e138-1518143905142-541600-02-000002.hwx.site.log
          76 kB
          Soumitra Sulav
        2. HDDS-728-ozone-0.3.005.patch
          30 kB
          Mukul Kumar Singh
        3. HDDS-728.012.patch
          28 kB
          Mukul Kumar Singh
        4. HDDS-728.011.patch
          25 kB
          Mukul Kumar Singh
        5. HDDS-728.010.patch
          25 kB
          Mukul Kumar Singh
        6. HDDS-728.009.patch
          25 kB
          Mukul Kumar Singh
        7. HDDS-728.008.patch
          24 kB
          Mukul Kumar Singh
        8. HDDS-728.007.patch
          24 kB
          Mukul Kumar Singh
        9. HDDS-728.006.patch
          26 kB
          Mukul Kumar Singh
        10. HDDS-728.005.patch
          26 kB
          Mukul Kumar Singh
        11. HDDS-728.004.patch
          22 kB
          Mukul Kumar Singh
        12. HDDS-728.003.patch
          21 kB
          Mukul Kumar Singh
        13. HDDS-728.002.patch
          20 kB
          Mukul Kumar Singh
        14. HDDS-728.001.patch
          20 kB
          Mukul Kumar Singh
        15. hadoop-root-scm-ctr-e138-1518143905142-541600-02-000002.hwx.site.log
          2.00 MB
          Soumitra Sulav
        16. hadoop-root-om-ctr-e138-1518143905142-541600-02-000002.hwx.site.log
          31 kB
          Soumitra Sulav
        17. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000008.hwx.site.log
          16.68 MB
          Soumitra Sulav
        18. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000007.hwx.site.log
          3.02 MB
          Soumitra Sulav
        19. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000006.hwx.site.log
          14.42 MB
          Soumitra Sulav
        20. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000005.hwx.site.log
          3.73 MB
          Soumitra Sulav
        21. hadoop-root-datanode-ctr-e138-1518143905142-552728-01-000004.hwx.site.log
          1.94 MB
          Soumitra Sulav
        22. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000010.hwx.site.log
          2.31 MB
          Soumitra Sulav
        23. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000009.hwx.site.log
          2.22 MB
          Soumitra Sulav
        24. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000008.hwx.site.log
          1.97 MB
          Soumitra Sulav
        25. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000003.hwx.site.log
          1.11 MB
          Soumitra Sulav
        26. hadoop-root-datanode-ctr-e138-1518143905142-541600-02-000002.hwx.site.log
          468 kB
          Soumitra Sulav

          People

            Assignee: Mukul Kumar Singh (msingh)
            Reporter: Soumitra Sulav (ssulav)