Hadoop Distributed Data Store / HDDS-1700

RPC payload too large on datanode startup in Kubernetes


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Cannot Reproduce
    • Affects Version/s: 0.4.0
    • Fix Version/s: None
    • Component/s: docker, Ozone Datanode, SCM
    • Labels: None
    • Environment:

    Description

      When starting the datanode in a separate Kubernetes pod from the SCM and OM, the error below appears in the datanode's ozone.log. We verified basic connectivity between the datanode pod and the OM/SCM pod.

      2019-06-17 17:14:16,449 [Datanode State Machine Thread - 0] ERROR (EndpointStateMachine.java:207) - Unable to communicate to SCM server at ozone-managers-service:9876 for past 31800 seconds.
      java.io.IOException: Failed on local exception: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length; Host Details : local host is: "ozone-datanode/10.244.84.187"; destination host is: "ozone-managers-service":9876;
      at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:816)
      at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
      at org.apache.hadoop.ipc.Client.call(Client.java:1457)
      at org.apache.hadoop.ipc.Client.call(Client.java:1367)
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
      at com.sun.proxy.$Proxy88.getVersion(Unknown Source)
      at org.apache.hadoop.ozone.protocolPB.StorageContainerDatanodeProtocolClientSideTranslatorPB.getVersion(StorageContainerDatanodeProtocolClientSideTranslatorPB.java:112)
      at org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:70)
      at org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:42)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
      Caused by: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length
      at org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1830)
      at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1173)
      at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1069)
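
      This exception is raised on the client side, in Client$IpcStreams.readResponse (visible in the trace above), when the 4-byte length prefix of a response exceeds the client's ipc.maximum.response.length limit (128 MB by default). In practice it often means that whatever answers on that host:port is not speaking Hadoop RPC at all (for example an HTTP server or a proxy in front of the pod), so the first ASCII bytes of its reply are misread as a huge length. A minimal, hypothetical probe (the class name is illustrative and not part of Ozone; host and port default to the values from the log above) that shows what actually answers on the configured SCM address:

      import java.io.InputStream;
      import java.io.OutputStream;
      import java.net.InetSocketAddress;
      import java.net.Socket;
      import java.nio.charset.StandardCharsets;

      /**
       * Hypothetical diagnostic, not part of Ozone: connect to the address the
       * datanode uses for the SCM and print the first bytes of the reply. An
       * "HTTP/1.1 ..." banner here would explain the error, because its leading
       * ASCII bytes are interpreted by the IPC client as an enormous length.
       */
      public class ProbeScmEndpoint {
        public static void main(String[] args) throws Exception {
          String host = args.length > 0 ? args[0] : "ozone-managers-service";
          int port = args.length > 1 ? Integer.parseInt(args[1]) : 9876;

          try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 5000);
            // Send a minimal HTTP request: a web server or proxy answers with an
            // HTTP response, while a Hadoop IPC server typically replies with a
            // short notice that the port is an IPC port.
            OutputStream out = socket.getOutputStream();
            out.write(("GET / HTTP/1.0\r\nHost: " + host + "\r\n\r\n")
                .getBytes(StandardCharsets.US_ASCII));
            out.flush();

            byte[] head = new byte[64];
            InputStream in = socket.getInputStream();
            int n = in.read(head);
            System.out.println("first bytes from " + host + ":" + port + " -> "
                + (n > 0 ? new String(head, 0, n, StandardCharsets.US_ASCII) : "<nothing>"));
          }
        }
      }

      If the probe prints an HTTP banner, the Kubernetes Service or the configured SCM address is routing the datanode to a web or proxy port instead of the SCM RPC port; if it times out or the connection is refused, the problem is routing between the pods rather than the RPC layer itself.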

       

      cc Salvatore LaMendola, [~anu], Marton Elek

    People

    • Assignee: Unassigned
    • Reporter: Josh Siegel
    • Votes: 0
    • Watchers: 4
