Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Key put failed for large file sizes
Error log:
ozone sh key put o3://ozone1723527225/vol-balancer-1723550072/buck-balancer-1723550072/cb_1723551201 /tmp/ozone_dir1723550065/cb_1723551201 --type=RATIS --replication=THREE
24/08/13 12:16:33 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:16:33 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig0-
24/08/13 12:16:33 INFO metrics.MetricRegistries: Loaded MetricRegistries class org.apache.ratis.metrics.dropwizard3.Dm3MetricRegistriesImpl
24/08/13 12:16:53 WARN grpc.GrpcUtil: Timed out gracefully shutting down connection: ManagedChannelOrphanWrapper{...
24/08/13 12:16:53 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:16:53 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig1-
24/08/13 12:17:02 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:17:02 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig2-
24/08/13 12:17:25 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:17:25 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig3-
24/08/13 12:17:36 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:17:36 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig4-
24/08/13 12:17:46 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:17:46 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig5-
24/08/13 12:17:57 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:17:57 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig6-
24/08/13 12:18:06 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:18:06 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig7-
24/08/13 12:18:19 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:18:19 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig8-
24/08/13 12:18:49 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:18:49 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig9-
24/08/13 12:19:13 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:19:13 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig10-
24/08/13 12:19:17 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:19:17 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig11-
24/08/13 12:19:37 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:19:37 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig12-
24/08/13 12:20:00 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:20:00 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig13-
24/08/13 12:20:11 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:20:11 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig14-
24/08/13 12:20:16 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:20:16 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig15-
24/08/13 12:20:27 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:20:27 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig16-
24/08/13 12:20:56 INFO scm.XceiverClientRatis: WatchType ALL_COMMITTED. Majority 2,
24/08/13 12:20:56 INFO netty.NettyConfigKeys$DataStream: setTlsConf GrpcTlsConfig17-
24/08/13 12:21:03 ERROR impl.OrderedAsync: Failed to send request, message=cmdType: WriteChunk traceID: "" containerID: 3 datanodeUuid: "8218b2c9-3b56-46e6-b23d-8e77e8a89174" writeChunk { blockID { containerID: 3 localID: 113750153625600043 blockCommitSequenceId: 2883 replicaIndex: 0 } chunkData { chunkName: "113750153625600043_chunk_57" offset: 234881024 len: 4194304 checksumData { type: CRC32 bytesPerChecksum: 16384 checksums: "c5\211\267" checksums: "\252\320y(" ... checksums: "\246L\317\027" checksums: "\332\r\264\a" } } } encodedToken: "..." version: 4 , data.size=4194304
java.util.concurrent.CompletionException: org.apache.ratis.protocol.exceptions.StateMachineException: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException from Server 8218b2c9-3b56-46e6-b23d-8e77e8a89174@group-1D60427BAA40: Container 3 in CLOSED state
    at org.apache.ratis.client.impl.RaftClientImpl.handleRaftException(RaftClientImpl.java:373)
    at org.apache.ratis.client.impl.OrderedAsync.lambda$send$3(OrderedAsync.java:175)
    at org.apache.ratis.client.impl.OrderedAsync$PendingOrderedRequest.setReply(OrderedAsync.java:105)
    at org.apache.ratis.client.impl.OrderedAsync$PendingOrderedRequest.setReply(OrderedAsync.java:66)
    at org.apache.ratis.util.SlidingWindow$RequestMap.setReply(SlidingWindow.java:147)
    at org.apache.ratis.util.SlidingWindow$Client.receiveReply(SlidingWindow.java:351)
    at org.apache.ratis.client.impl.OrderedAsync.lambda$sendRequestWithRetry$5(OrderedAsync.java:210)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.lambda$onNext$0(GrpcClientProtocolClient.java:322)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.handleReplyFuture(GrpcClientProtocolClient.java:378)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.access$100(GrpcClientProtocolClient.java:300)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:322)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:305)
    at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:468)
    at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
    at org.apache.ratis.thirdparty.io.grpc.internal.DelayedClientCall$DelayedListener.onMessage(DelayedClientCall.java:473)
    at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:660)
    at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:647)
    at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
Caused by: org.apache.ratis.protocol.exceptions.StateMachineException: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException from Server 8218b2c9-3b56-46e6-b23d-8e77e8a89174@group-1D60427BAA40: Container 3 in CLOSED state
    at org.apache.ratis.server.impl.RaftServerImpl.writeAsyncImpl(RaftServerImpl.java:975)
    at org.apache.ratis.server.impl.RaftServerImpl.writeAsync(RaftServerImpl.java:945)
    at org.apache.ratis.server.impl.RaftServerImpl.replyFuture(RaftServerImpl.java:937)
    at org.apache.ratis.server.impl.RaftServerImpl.submitClientRequestAsync(RaftServerImpl.java:912)
    at org.apache.ratis.server.impl.RaftServerImpl.lambda$executeSubmitClientRequestAsync$11(RaftServerImpl.java:893)
    ... 3 more
Caused by: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException: Container 3 in CLOSED state
    at org.apache.ratis.util.ReflectionUtils.instantiateException(ReflectionUtils.java:259)
    at org.apache.ratis.client.impl.ClientProtoUtils.toStateMachineException(ClientProtoUtils.java:451)
    at org.apache.ratis.client.impl.ClientProtoUtils.toStateMachineException(ClientProtoUtils.java:437)
    at org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:404)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:310)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:305)
    at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:468)
    at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
    at org.apache.ratis.thirdparty.io.grpc.internal.DelayedClientCall$DelayedListener.onMessage(DelayedClientCall.java:473)
    at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:660)
    at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:647)
    at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
    ... 3 more
24/08/13 12:21:03 ERROR impl.OrderedAsync: Failed to send request, message=cmdType: WriteChunk traceID: "" containerID: 3 datanodeUuid: "8218b2c9-3b56-46e6-b23d-8e77e8a89174" writeChunk { blockID { containerID: 3 localID: 113750153625600043 blockCommitSequenceId: 2883 replicaIndex: 0 } chunkData { chunkName: "113750153625600043_chunk_58" offset: 239075328 len: 4194304 checksumData { type: CRC32 bytesPerChecksum: 16384 checksums: "\267s\377\353" checksums: "\373>\370Y" ... checksums: "`-\274>" checksums: "\334\310\271\313" } } } encodedToken: "..." version: 4 , data.size=4194304
java.util.concurrent.CompletionException: org.apache.ratis.protocol.exceptions.StateMachineException: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException from Server 8218b2c9-3b56-46e6-b23d-8e77e8a89174@group-1D60427BAA40: Container 3 in CLOSED state
    at org.apache.ratis.client.impl.RaftClientImpl.handleRaftException(RaftClientImpl.java:373)
    at org.apache.ratis.client.impl.OrderedAsync.lambda$send$3(OrderedAsync.java:175)
    at org.apache.ratis.client.impl.OrderedAsync$PendingOrderedRequest.setReply(OrderedAsync.java:105)
    at org.apache.ratis.client.impl.OrderedAsync$PendingOrderedRequest.setReply(OrderedAsync.java:66)
    at org.apache.ratis.util.SlidingWindow$RequestMap.setReply(SlidingWindow.java:147)
    at org.apache.ratis.util.SlidingWindow$Client.receiveReply(SlidingWindow.java:351)
    at org.apache.ratis.client.impl.OrderedAsync.lambda$sendRequestWithRetry$5(OrderedAsync.java:210)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.lambda$onNext$0(GrpcClientProtocolClient.java:322)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.handleReplyFuture(GrpcClientProtocolClient.java:378)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.access$100(GrpcClientProtocolClient.java:300)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:322)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:305)
    at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:468)
    at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
    at org.apache.ratis.thirdparty.io.grpc.internal.DelayedClientCall$DelayedListener.onMessage(DelayedClientCall.java:473)
    at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:660)
    at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:647)
    at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
Caused by: org.apache.ratis.protocol.exceptions.StateMachineException: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException from Server 8218b2c9-3b56-46e6-b23d-8e77e8a89174@group-1D60427BAA40: Container 3 in CLOSED state
    at org.apache.ratis.server.impl.RaftServerImpl.writeAsyncImpl(RaftServerImpl.java:975)
    at org.apache.ratis.server.impl.RaftServerImpl.writeAsync(RaftServerImpl.java:945)
    at org.apache.ratis.server.impl.RaftServerImpl.replyFuture(RaftServerImpl.java:937)
    at org.apache.ratis.server.impl.RaftServerImpl.submitClientRequestAsync(RaftServerImpl.java:912)
    at org.apache.ratis.server.impl.RaftServerImpl.lambda$executeSubmitClientRequestAsync$11(RaftServerImpl.java:893)
    ... 3 more
Caused by: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException: Container 3 in CLOSED state
    at org.apache.ratis.util.ReflectionUtils.instantiateException(ReflectionUtils.java:259)
    at org.apache.ratis.client.impl.ClientProtoUtils.toStateMachineException(ClientProtoUtils.java:451)
    at org.apache.ratis.client.impl.ClientProtoUtils.toStateMachineException(ClientProtoUtils.java:437)
    at org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:404)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:310)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:305)
    at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:468)
    at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
    at org.apache.ratis.thirdparty.io.grpc.internal.DelayedClientCall$DelayedListener.onMessage(DelayedClientCall.java:473)
    at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:660)
    at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:647)
    at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
    ... 3 more
...
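The two failing WriteChunk requests in the log above carry a chunk name, an offset, and a length, and a little arithmetic shows how far into the block the write had progressed when container 3 was reported CLOSED. The following is a minimal sketch for reading those fields; the regular expression and the `describe_chunk` helper are illustrative assumptions, not part of Ozone or Ratis, and the input strings are copied from the log.

```python
import re

# Fields copied verbatim from the two failing WriteChunk requests in the log.
FAILED_CHUNKS = [
    'chunkName: "113750153625600043_chunk_57" offset: 234881024 len: 4194304',
    'chunkName: "113750153625600043_chunk_58" offset: 239075328 len: 4194304',
]

# Hypothetical pattern for the text form of the chunkData message.
CHUNK_RE = re.compile(r'chunkName: "(\d+)_chunk_(\d+)" offset: (\d+) len: (\d+)')

MIB = 1024 * 1024


def describe_chunk(entry):
    """Extract the chunk index, offset and length (in MiB) from one log fragment."""
    match = CHUNK_RE.search(entry)
    if match is None:
        raise ValueError("no chunkData fields found in: " + entry)
    local_id, index, offset, length = (int(g) for g in match.groups())
    return {
        "local_id": local_id,
        "chunk_index": index,
        "offset_mib": offset / MIB,
        "len_mib": length / MIB,
        # Number of full-size chunks that fit before this offset.
        "chunks_before": offset // length,
    }


if __name__ == "__main__":
    for entry in FAILED_CHUNKS:
        print(describe_chunk(entry))
```

For these two entries the offsets work out to 224 MiB and 228 MiB with 4 MiB chunks, so chunk_57 and chunk_58 are consecutive and the chunk index is one more than the number of preceding chunks. Both offsets are relative to the block with localID 113750153625600043, not to the 20GB file as a whole.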
File size in test: 20 GB
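A reproduction attempt might look like the sketch below, which generates a large local file and assembles the same `ozone sh key put` invocation that appears in the error log. The helper names `write_test_file` and `build_put_command` are hypothetical, the key URI and paths in the demo are placeholders, and nothing here confirms that a given cluster will hit the same failure; it only mirrors the command and flags from the report.

```python
import os
import tempfile


def write_test_file(path, size_bytes, block_size=4 * 1024 * 1024):
    """Write size_bytes of random data to path, one block at a time."""
    remaining = size_bytes
    with open(path, "wb") as handle:
        while remaining > 0:
            n = min(block_size, remaining)
            handle.write(os.urandom(n))
            remaining -= n
    return path


def build_put_command(key_uri, local_path):
    """Return the argument list for the put command, using the flags from the log."""
    return [
        "ozone", "sh", "key", "put",
        key_uri, local_path,
        "--type=RATIS", "--replication=THREE",
    ]


if __name__ == "__main__":
    # Small demo size; the report used a 20 GB file.
    with tempfile.TemporaryDirectory() as tmp:
        local = write_test_file(os.path.join(tmp, "testfile"), 8 * 1024 * 1024)
        # Placeholder URI: substitute a real service ID, volume, bucket and key.
        print(" ".join(build_put_command("o3://service/volume/bucket/key", local)))
```

To approach the size in the report, `size_bytes` would be `20 * 1024**3`; the resulting argument list can be passed to `subprocess.run` on a host where the `ozone` CLI is installed.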
Issue Links
- is fixed by: RATIS-2141 OOM for stateMachineCache use cases (Resolved)