Apache Ozone / HDDS-11317

Key put failed for large file sizes

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Ozone CLI
    • Labels: None

    Description

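      `ozone sh key put` fails when uploading a large file. WriteChunk requests are rejected with ContainerNotOpenException (Container 3 in CLOSED state).

      Command and client log: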

      ozone sh key put o3://ozone1723527225/vol-balancer-1723550072/buck-balancer-1723550072/cb_1723551201 /tmp/ozone_dir1723550065/cb_1723551201 --type=RATIS --replication=THREE
      
      24/08/13 12:16:33 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:16:33 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig0-
      24/08/13 12:16:33 INFO metrics.MetricRegistries: Loaded MetricRegistries class org.apache.ratis.metrics.dropwizard3.Dm3MetricRegistriesImpl
      24/08/13 12:16:53 WARN grpc.GrpcUtil: Timed out gracefully shutting down connection: ManagedChannelOrphanWrapper{...
      24/08/13 12:16:53 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:16:53 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig1-
      24/08/13 12:17:02 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:17:02 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig2-
      24/08/13 12:17:25 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:17:25 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig3-
      24/08/13 12:17:36 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:17:36 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig4-
      24/08/13 12:17:46 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:17:46 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig5-
      24/08/13 12:17:57 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:17:57 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig6-
      24/08/13 12:18:06 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:18:06 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig7-
      24/08/13 12:18:19 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:18:19 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig8-
      24/08/13 12:18:49 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:18:49 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig9-
      24/08/13 12:19:13 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:19:13 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig10-
      24/08/13 12:19:17 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:19:17 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig11-
      24/08/13 12:19:37 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:19:37 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig12-
      24/08/13 12:20:00 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:20:00 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig13-
      24/08/13 12:20:11 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:20:11 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig14-
      24/08/13 12:20:16 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:20:16 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig15-
      24/08/13 12:20:27 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:20:27 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig16-
      24/08/13 12:20:56 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2, 
      24/08/13 12:20:56 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig17-
      24/08/13 12:21:03 ERROR impl.OrderedAsync: Failed to send request, message=cmdType: WriteChunk
      traceID: ""
      containerID: 3
      datanodeUuid: "8218b2c9-3b56-46e6-b23d-8e77e8a89174"
      writeChunk {
        blockID {
          containerID: 3
          localID: 113750153625600043
          blockCommitSequenceId: 2883
          replicaIndex: 0
        }
        chunkData {
          chunkName: "113750153625600043_chunk_57"
          offset: 234881024
          len: 4194304
          checksumData {
            type: CRC32
            bytesPerChecksum: 16384
            checksums: "c5\211\267"
            checksums: "\252\320y("
            ...
            checksums: "\246L\317\027"
            checksums: "\332\r\264\a"
          }
        }
      }
      encodedToken: "..."
      version: 4
      , data.size=4194304
      java.util.concurrent.CompletionException: org.apache.ratis.protocol.exceptions.StateMachineException: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException from Server 8218b2c9-3b56-46e6-b23d-8e77e8a89174@group-1D60427BAA40: Container 3 in CLOSED state
      	at org.apache.ratis.client.impl.RaftClientImpl.handleRaftException(RaftClientImpl.java:373)
      	at org.apache.ratis.client.impl.OrderedAsync.lambda$send$3(OrderedAsync.java:175)
      	at org.apache.ratis.client.impl.OrderedAsync$PendingOrderedRequest.setReply(OrderedAsync.java:105)
      	at org.apache.ratis.client.impl.OrderedAsync$PendingOrderedRequest.setReply(OrderedAsync.java:66)
      	at org.apache.ratis.util.SlidingWindow$RequestMap.setReply(SlidingWindow.java:147)
      	at org.apache.ratis.util.SlidingWindow$Client.receiveReply(SlidingWindow.java:351)
      	at org.apache.ratis.client.impl.OrderedAsync.lambda$sendRequestWithRetry$5(OrderedAsync.java:210)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.lambda$onNext$0(GrpcClientProtocolClient.java:322)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.handleReplyFuture(GrpcClientProtocolClient.java:378)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.access$100(GrpcClientProtocolClient.java:300)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:322)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:305)
      	at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:468)
      	at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
      	at org.apache.ratis.thirdparty.io.grpc.internal.DelayedClientCall$DelayedListener.onMessage(DelayedClientCall.java:473)
      	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:660)
      	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:647)
      	at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      	at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
      Caused by: org.apache.ratis.protocol.exceptions.StateMachineException: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException from Server 8218b2c9-3b56-46e6-b23d-8e77e8a89174@group-1D60427BAA40: Container 3 in CLOSED state
      	at org.apache.ratis.server.impl.RaftServerImpl.writeAsyncImpl(RaftServerImpl.java:975)
      	at org.apache.ratis.server.impl.RaftServerImpl.writeAsync(RaftServerImpl.java:945)
      	at org.apache.ratis.server.impl.RaftServerImpl.replyFuture(RaftServerImpl.java:937)
      	at org.apache.ratis.server.impl.RaftServerImpl.submitClientRequestAsync(RaftServerImpl.java:912)
      	at org.apache.ratis.server.impl.RaftServerImpl.lambda$executeSubmitClientRequestAsync$11(RaftServerImpl.java:893)
      	... 3 more
      Caused by: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException: Container 3 in CLOSED state
      	at org.apache.ratis.util.ReflectionUtils.instantiateException(ReflectionUtils.java:259)
      	at org.apache.ratis.client.impl.ClientProtoUtils.toStateMachineException(ClientProtoUtils.java:451)
      	at org.apache.ratis.client.impl.ClientProtoUtils.toStateMachineException(ClientProtoUtils.java:437)
      	at org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:404)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:310)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:305)
      	at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:468)
      	at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
      	at org.apache.ratis.thirdparty.io.grpc.internal.DelayedClientCall$DelayedListener.onMessage(DelayedClientCall.java:473)
      	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:660)
      	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:647)
      	at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      	at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
      	... 3 more
      24/08/13 12:21:03 ERROR impl.OrderedAsync: Failed to send request, message=cmdType: WriteChunk
      traceID: ""
      containerID: 3
      datanodeUuid: "8218b2c9-3b56-46e6-b23d-8e77e8a89174"
      writeChunk {
        blockID {
          containerID: 3
          localID: 113750153625600043
          blockCommitSequenceId: 2883
          replicaIndex: 0
        }
        chunkData {
          chunkName: "113750153625600043_chunk_58"
          offset: 239075328
          len: 4194304
          checksumData {
            type: CRC32
            bytesPerChecksum: 16384
            checksums: "\267s\377\353"
            checksums: "\373>\370Y"
            ...
            checksums: "`-\274>"
            checksums: "\334\310\271\313"
          }
        }
      }
      encodedToken: "..."
      version: 4
      , data.size=4194304
      java.util.concurrent.CompletionException: org.apache.ratis.protocol.exceptions.StateMachineException: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException from Server 8218b2c9-3b56-46e6-b23d-8e77e8a89174@group-1D60427BAA40: Container 3 in CLOSED state
      	at org.apache.ratis.client.impl.RaftClientImpl.handleRaftException(RaftClientImpl.java:373)
      	at org.apache.ratis.client.impl.OrderedAsync.lambda$send$3(OrderedAsync.java:175)
      	at org.apache.ratis.client.impl.OrderedAsync$PendingOrderedRequest.setReply(OrderedAsync.java:105)
      	at org.apache.ratis.client.impl.OrderedAsync$PendingOrderedRequest.setReply(OrderedAsync.java:66)
      	at org.apache.ratis.util.SlidingWindow$RequestMap.setReply(SlidingWindow.java:147)
      	at org.apache.ratis.util.SlidingWindow$Client.receiveReply(SlidingWindow.java:351)
      	at org.apache.ratis.client.impl.OrderedAsync.lambda$sendRequestWithRetry$5(OrderedAsync.java:210)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.lambda$onNext$0(GrpcClientProtocolClient.java:322)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.handleReplyFuture(GrpcClientProtocolClient.java:378)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.access$100(GrpcClientProtocolClient.java:300)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:322)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:305)
      	at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:468)
      	at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
      	at org.apache.ratis.thirdparty.io.grpc.internal.DelayedClientCall$DelayedListener.onMessage(DelayedClientCall.java:473)
      	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:660)
      	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:647)
      	at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      	at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
      Caused by: org.apache.ratis.protocol.exceptions.StateMachineException: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException from Server 8218b2c9-3b56-46e6-b23d-8e77e8a89174@group-1D60427BAA40: Container 3 in CLOSED state
      	at org.apache.ratis.server.impl.RaftServerImpl.writeAsyncImpl(RaftServerImpl.java:975)
      	at org.apache.ratis.server.impl.RaftServerImpl.writeAsync(RaftServerImpl.java:945)
      	at org.apache.ratis.server.impl.RaftServerImpl.replyFuture(RaftServerImpl.java:937)
      	at org.apache.ratis.server.impl.RaftServerImpl.submitClientRequestAsync(RaftServerImpl.java:912)
      	at org.apache.ratis.server.impl.RaftServerImpl.lambda$executeSubmitClientRequestAsync$11(RaftServerImpl.java:893)
      	... 3 more
      Caused by: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException: Container 3 in CLOSED state
      	at org.apache.ratis.util.ReflectionUtils.instantiateException(ReflectionUtils.java:259)
      	at org.apache.ratis.client.impl.ClientProtoUtils.toStateMachineException(ClientProtoUtils.java:451)
      	at org.apache.ratis.client.impl.ClientProtoUtils.toStateMachineException(ClientProtoUtils.java:437)
      	at org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:404)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:310)
      	at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:305)
      	at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:468)
      	at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
      	at org.apache.ratis.thirdparty.io.grpc.internal.DelayedClientCall$DelayedListener.onMessage(DelayedClientCall.java:473)
      	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:660)
      	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:647)
      	at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
      	at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
      	... 3 more
      ...
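
      To make the log above easier to triage, here is a minimal sketch that pulls out the failing chunk names, the innermost "Caused by" exception, and the time between the first and last timestamped lines. The `LOG` string is an abbreviated copy of the excerpt above (stack frames dropped), and `summarize` is a hypothetical helper written for this report, not part of Ozone or Ratis.

```python
import re
from datetime import datetime

# Abbreviated copy of the client log in this report (stack frames omitted).
LOG = """\
24/08/13 12:16:33 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:21:03 ERROR impl.OrderedAsync: Failed to send request, message=cmdType: WriteChunk
    chunkName: "113750153625600043_chunk_57"
    offset: 234881024
Caused by: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException: Container 3 in CLOSED state
24/08/13 12:21:03 ERROR impl.OrderedAsync: Failed to send request, message=cmdType: WriteChunk
    chunkName: "113750153625600043_chunk_58"
    offset: 239075328
Caused by: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException: Container 3 in CLOSED state
"""

TIMESTAMP = re.compile(r"^\s*(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) ", re.M)
CHUNK = re.compile(r'chunkName: "(\w+)"')
CAUSE = re.compile(r"^\s*Caused by: ([\w.$]+): (.*)$", re.M)


def summarize(log: str) -> dict:
    """Hypothetical helper: condense a client log into a few triage facts."""
    stamps = [datetime.strptime(t, "%y/%m/%d %H:%M:%S")
              for t in TIMESTAMP.findall(log)]
    causes = CAUSE.findall(log)
    # The last "Caused by" line is the innermost cause of the last trace.
    root_class, root_message = causes[-1] if causes else (None, None)
    return {
        "chunks": CHUNK.findall(log),
        "root_class": root_class,
        "root_message": root_message,
        "elapsed_seconds": (int((stamps[-1] - stamps[0]).total_seconds())
                            if stamps else None),
    }


summary = summarize(LOG)
```

      On this excerpt the summary lists chunk_57 and chunk_58 as the rejected writes and ContainerNotOpenException as the innermost cause, with the failures logged 4 minutes 30 seconds after the first client log line.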
      

      File size in test: 20 GB
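
      For context, the numbers in the WriteChunk messages can be related to the file size with a little arithmetic. The sketch below uses only `offset`, `len` and `bytesPerChecksum` from the log. The 256 MiB block size and the reading of "20GB" as 20 GiB are assumptions, since the report states neither the cluster's configured block size nor the exact byte count.

```python
# Values copied from the WriteChunk messages in the log.
CHUNK_LEN = 4194304            # "len": 4 MiB per chunk
BYTES_PER_CHECKSUM = 16384     # "bytesPerChecksum"
OFFSETS = {"chunk_57": 234881024, "chunk_58": 239075328}  # "offset" values

# Assumptions, not stated in the report.
ASSUMED_BLOCK_SIZE = 256 * 1024 ** 2   # assumed block size of 256 MiB
ASSUMED_FILE_SIZE = 20 * 1024 ** 3     # "20GB" read as 20 GiB


def chunk_index(offset: int, chunk_len: int = CHUNK_LEN) -> int:
    """Zero-based position of a chunk inside its block."""
    index, remainder = divmod(offset, chunk_len)
    if remainder:
        raise ValueError("offset is not aligned to the chunk length")
    return index


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


indexes = {name: chunk_index(off) for name, off in OFFSETS.items()}
mib_before_failure = OFFSETS["chunk_57"] // 1024 ** 2
checksums_per_chunk = CHUNK_LEN // BYTES_PER_CHECKSUM
chunks_per_block = ASSUMED_BLOCK_SIZE // CHUNK_LEN
blocks_for_file = ceil_div(ASSUMED_FILE_SIZE, ASSUMED_BLOCK_SIZE)
```

      The offsets are exact multiples of the 4 MiB chunk length, so chunk_57 sits at zero-based index 56, i.e. 224 MiB into its block, and each chunk carries 256 CRC32 checksums. Under the assumed block size, a 20 GiB file spans 80 blocks of 64 chunks each, so the writer has to move across many blocks and containers during a single put.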

            People

              Assignee: Sammi Chen (Sammi)
              Reporter: Jyotirmoy Sinha (jyosin)