Hadoop HDFS
HDFS-6448

BlockReaderLocalLegacy should set socket timeout based on conf.socketTimeout

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 2.4.0
    • Fix Version/s: 2.5.0
    • Component/s: hdfs-client
    • Labels: None

      Description

      Our HBase cluster is deployed on Hadoop 2.0. During one incident we hit HDFS-5016 on the HDFS side, but we also found that on the HBase side the DFS client was hung in getBlockReader. After reading the code, we found that the current codebase does have a timeout setting, but the default hdfsTimeout value is -1 (from Client.java: getTimeout(conf)), which means no timeout at all...

      The hung stack trace looks like the following:
      at $Proxy21.getBlockLocalPathInfo(Unknown Source)
      at org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolTranslatorPB.java:215)
      at org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:267)
      at org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:180)
      at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:812)

      One feasible fix is replacing hdfsTimeout with socketTimeout; see the attached patch. Most of the credit should go to Liu Shaohui.
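      The shape of the fix can be sketched as follows. This is a minimal, self-contained illustration of the timeout-selection logic described above, not the actual Hadoop patch; the class and method names here are hypothetical, and the 60000 ms default stands in for the configured dfs.client.socket-timeout value.

      ```java
      public class BlockReaderTimeoutSketch {
          private static final int NO_TIMEOUT = -1;

          // Before: Client.getTimeout(conf) returns -1 by default, so no read
          // timeout is ever applied to the block-reader RPC socket and a call
          // like getBlockLocalPathInfo can hang indefinitely.
          static int timeoutBefore(int hdfsTimeout) {
              return hdfsTimeout; // -1 means the caller never sets a timeout
          }

          // After: prefer the configured socket timeout so a stuck datanode
          // call eventually fails instead of hanging the client forever.
          static int timeoutAfter(int hdfsTimeout, int socketTimeoutMs) {
              return socketTimeoutMs > 0 ? socketTimeoutMs : hdfsTimeout;
          }

          public static void main(String[] args) {
              System.out.println(timeoutBefore(NO_TIMEOUT));       // -1: unbounded wait
              System.out.println(timeoutAfter(NO_TIMEOUT, 60000)); // 60000: bounded wait
          }
      }
      ```

      With a positive timeout in place, a hung datanode surfaces as a timeout exception in the client rather than the indefinite hang shown in the stack trace above.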

      1. HDFS-6448.txt
        0.7 kB
        Liang Xie


          People

          • Assignee:
            Liang Xie
          • Reporter:
            Liang Xie
          • Votes:
            0
          • Watchers:
            5
