Hadoop HDFS / HDFS-6448

BlockReaderLocalLegacy should set socket timeout based on conf.socketTimeout

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0, 3.0.0-alpha1
    • Fix Version/s: 2.5.0
    • Component/s: hdfs-client
    • Labels: None
    • Target Version/s:

      Description

      Our HBase is deployed on Hadoop 2.0. In one incident we hit HDFS-5016 on the HDFS side, but we also found that on the HBase side the DFS client hung at getBlockReader. After reading the code, we found there is a timeout setting in the current codebase, but the default hdfsTimeout value is -1 (from Client.getTimeout(conf) in Client.java), which means no timeout at all; a paraphrased sketch of that code path follows.
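
      For reference, here is a paraphrased sketch (not the verbatim source) of how org.apache.hadoop.ipc.Client.getTimeout(conf) ends up returning -1 when ipc.client.ping is left at its default; the configuration key names are inlined for readability:

      // Paraphrased from org.apache.hadoop.ipc.Client (Configuration is
      // org.apache.hadoop.conf.Configuration).
      public static int getTimeout(Configuration conf) {
        if (!conf.getBoolean("ipc.client.ping", true)) {
          // Pings disabled: fall back to the ping interval as the timeout.
          return conf.getInt("ipc.ping.interval", 60000);
        }
        // Pings enabled (the default): -1 is treated downstream as
        // "no socket timeout", so a wedged RPC can hang forever.
        return -1;
      }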

      The hung stack trace looks like the following:
      at $Proxy21.getBlockLocalPathInfo(Unknown Source)
      at org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolTranslatorPB.java:215)
      at org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:267)
      at org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:180)
      at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:812)

      One feasible fix is to replace hdfsTimeout with socketTimeout when creating the datanode proxy; see the attached patch, and the sketch below for the idea. Most of the credit should go to Liu Shaohui.
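
      A minimal sketch of the idea (names and signatures are approximate, not the exact patch): create the ClientDatanodeProtocol proxy with the client's configured socketTimeout instead of hdfsTimeout, whose -1 default disables the read timeout entirely:

      // Simplified from BlockReaderLocalLegacy's path-info lookup; the real
      // code also caches one proxy per local datanode. Types come from
      // org.apache.hadoop.hdfs.*, org.apache.hadoop.security.token.*, and
      // org.apache.hadoop.ipc.*.
      private static BlockLocalPathInfo getBlockPathInfo(ExtendedBlock blk,
          DatanodeInfo node, Configuration conf,
          int socketTimeout, // was: hdfsTimeout (default -1 == no timeout)
          Token<BlockTokenIdentifier> token, boolean connectToDnViaHostname)
          throws IOException {
        ClientDatanodeProtocol proxy = DFSUtil.createClientDatanodeProtocolProxy(
            node, conf, socketTimeout, connectToDnViaHostname);
        try {
          // Now fails after socketTimeout instead of blocking indefinitely
          // when the datanode side is stuck (e.g. HDFS-5016).
          return proxy.getBlockLocalPathInfo(blk, token);
        } finally {
          RPC.stopProxy(proxy);
        }
      }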

        Attachments

          Activity

            People

            • Assignee: xieliang007 Liang Xie
            • Reporter: xieliang007 Liang Xie
            • Votes: 0
            • Watchers: 4

              Dates

              • Created:
              • Updated:
              • Resolved: