Details
- Bug
- Status: Resolved
- Critical
- Resolution: Duplicate
- 1.2.0
- None
- None
Description
Recon checks for and adds a container from SCM the first time it sees that container in a report. If Recon has been down for a long time, it has many new containers to consume, and report processing can hang on the RPC call that fetches each one from SCM. If SCM is down, the bottleneck gets even worse.
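The cost of the first-seen check can be sketched as follows. This is a minimal illustration, not Recon's code: `FirstSeenContainerSketch`, `onReport`, and the `scmLookup` function are hypothetical names, and the lookup merely stands in for the blocking call to SCM. It only shows that every previously unseen container ID costs one synchronous lookup on the thread that processes the report.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.LongFunction;

/** Hypothetical stand-in for the first-seen container check; not the real ReconContainerManager. */
class FirstSeenContainerSketch {
    private final Set<Long> knownContainers = new HashSet<>();
    // Assumption: stands in for the blocking RPC that fetches container details from SCM.
    private final LongFunction<String> scmLookup;
    private int lookupCount = 0;

    FirstSeenContainerSketch(LongFunction<String> scmLookup) {
        this.scmLookup = scmLookup;
    }

    /** Processes one report on the caller's thread; each unknown ID costs one blocking lookup. */
    void onReport(List<Long> containerIds) {
        for (long id : containerIds) {
            if (knownContainers.contains(id)) {
                continue; // already known, no call to SCM needed
            }
            scmLookup.apply(id); // the handler thread waits here until the lookup returns
            lookupCount++;
            knownContainers.add(id);
        }
    }

    int lookupCount() {
        return lookupCount;
    }

    public static void main(String[] args) {
        FirstSeenContainerSketch recon = new FirstSeenContainerSketch(id -> "pipeline-for-" + id);
        recon.onReport(List.of(1L, 2L, 3L)); // three new containers, three lookups
        recon.onReport(List.of(2L, 3L, 4L)); // only container 4 is new, one more lookup
        System.out.println("blocking lookups issued: " + recon.lookupCount());
    }
}
```

With a large backlog of unseen containers, the number of lookups grows with the backlog, and a slow or unavailable SCM stalls every report queued behind the current one.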
EventQueue-ContainerReportForReconContainerReportHandler PRIORITY : 5 THREAD ID : 0X00007F2A6DDC3000 NATIVE ID : 0XD324 NATIVE ID (DECIMAL) : 54052 STATE : BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.hadoop.ipc.Client$Connection.addCall(Client.java:521)
    - waiting to lock <0x00007f1a70482730> (a org.apache.hadoop.ipc.Client$Connection)
    at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:413)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1623)
    at org.apache.hadoop.ipc.Client.call(Client.java:1452)
    at org.apache.hadoop.ipc.Client.call(Client.java:1405)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
    at com.sun.proxy.$Proxy41.submitRequest(Unknown Source)
    at jdk.internal.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.5/DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(java.base@11.0.5/Method.java:566)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
    - locked <0x00007f1a6ca20ad8> (a org.apache.hadoop.io.retry.RetryInvocationHandler$Call)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
    at com.sun.proxy.$Proxy41.submitRequest(Unknown Source)
    at org.apache.hadoop.hdds.scm.protocolPB.StorageContainerLocationProtocolClientSideTranslatorPB.submitRpcRequest(StorageContainerLocationProtocolClientSideTranslatorPB.java:154)
    at org.apache.hadoop.hdds.scm.protocolPB.StorageContainerLocationProtocolClientSideTranslatorPB.submitRequest(StorageContainerLocationProtocolClientSideTranslatorPB.java:144)
    at org.apache.hadoop.hdds.scm.protocolPB.StorageContainerLocationProtocolClientSideTranslatorPB.getContainerWithPipeline(StorageContainerLocationProtocolClientSideTranslatorPB.java:230)
    at org.apache.hadoop.ozone.recon.spi.impl.StorageContainerServiceProviderImpl.getContainerWithPipeline(StorageContainerServiceProviderImpl.java:63)
    at org.apache.hadoop.ozone.recon.scm.ReconContainerManager.checkAndAddNewContainer(ReconContainerManager.java:122)
    at org.apache.hadoop.ozone.recon.scm.ReconContainerReportHandler.onMessage(ReconContainerReportHandler.java:62)
    at org.apache.hadoop.ozone.recon.scm.ReconContainerReportHandler.onMessage(ReconContainerReportHandler.java:38)
    at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
    at org.apache.hadoop.hdds.server.events.SingleThreadExecutor$$Lambda$405/0x00007f19c2857d08.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.5/ThreadPoolExecutor.java:1128)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.5/ThreadPoolExecutor.java:628)
    at java.lang.Thread.run(java.base@11.0.5/Thread.java:834)
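The "BLOCKED (on object monitor)" state in the dump means the handler thread is waiting to enter a synchronized section whose monitor another thread holds. The sketch below reproduces that state with plain JDK threads. It is only an illustration: `BlockedOnMonitorSketch`, `connectionLock`, and the thread names are hypothetical, and the lock object merely stands in for the `Client$Connection` monitor named in the trace. It does not use the Hadoop IPC client.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical reproduction of a thread blocked on an object monitor; not Hadoop IPC code. */
class BlockedOnMonitorSketch {

    /** Starts a lock holder and a contender, and returns the contender's state while it waits. */
    static Thread.State observeContendedState() {
        // Assumption: a plain Object stands in for the connection monitor from the trace.
        final Object connectionLock = new Object();
        final CountDownLatch holderHasLock = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            synchronized (connectionLock) {
                holderHasLock.countDown();
                try {
                    release.await(); // keep the monitor until the observer is done
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "lock-holder");

        Thread waiter = new Thread(() -> {
            synchronized (connectionLock) {
                // A real client would enqueue its call here; the sketch only needs the lock attempt.
            }
        }, "report-handler");

        try {
            holder.start();
            holderHasLock.await(); // make sure the monitor is taken before the contender starts
            waiter.start();
            Thread.State state = waiter.getState();
            while (state != Thread.State.BLOCKED) { // poll until the contender parks on the monitor
                Thread.sleep(1);
                state = waiter.getState();
            }
            release.countDown();
            holder.join();
            waiter.join();
            return state;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while observing thread state", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("report-handler state: " + observeContendedState());
    }
}
```

Because the report handler in the trace runs on a `SingleThreadExecutor`, a thread stuck in this state holds up every report queued behind it.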
Issue Links
- is fixed by: HDDS-6974 Container report processing in Recon is single threaded (Resolved)