  1. Hadoop HDFS
  2. HDFS-6448

BlockReaderLocalLegacy should set socket timeout based on conf.socketTimeout


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0, 3.0.0-alpha1
    • Fix Version/s: 2.5.0
    • Component/s: hdfs-client
    • Labels: None

    Description

      Our HBase cluster is deployed on Hadoop 2.0. During one incident we hit HDFS-5016 on the HDFS side, and on the HBase side we also found that the DFS client was hung in getBlockReader. After reading the code, we found that the current codebase does have a timeout setting, but the default hdfsTimeout value is "-1" (from Client.java:getTimeout(conf)), which means no timeout.

      The stack trace of the hung thread looks like the following:
      at $Proxy21.getBlockLocalPathInfo(Unknown Source)
      at org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolTranslatorPB.java:215)
      at org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:267)
      at org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:180)
      at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:812)

      One feasible fix is to replace hdfsTimeout with socketTimeout; see the attached patch. Most of the credit should go to liushaohui.
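
      To illustrate why a "no timeout" setting can hang a reader, here is a minimal sketch in plain java.net. It is not the attached patch and uses no Hadoop classes; the class name SocketTimeoutSketch and the method readTimesOut are hypothetical, and it assumes the stuck call ultimately blocks on a socket read. It connects to a local peer that never replies and shows that a positive read timeout turns the hang into a SocketTimeoutException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration only; not part of HDFS or of the attached patch.
class SocketTimeoutSketch {

    /**
     * Connects to a local peer that never writes a byte and tries to read
     * from it. Returns true if the read was abandoned because the timeout
     * expired. A value of 0 means "wait forever" for Socket.setSoTimeout,
     * which would reproduce the hang, so non-positive values are rejected.
     */
    static boolean readTimesOut(int timeoutMillis) {
        if (timeoutMillis <= 0) {
            throw new IllegalArgumentException("timeout must be positive");
        }
        try (ServerSocket silentPeer =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentPeer.getInetAddress(),
                                        silentPeer.getLocalPort())) {
            // Bound every blocking read on this socket.
            client.setSoTimeout(timeoutMillis);
            try {
                client.getInputStream().read(); // the peer never answers
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

      With a bounded timeout the caller gets an exception it can handle, for example by falling back to another read path, instead of blocking indefinitely.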

      Attachments

        1. HDFS-6448.txt
          0.7 kB
          Liang Xie

          People

            Assignee: Liang Xie (xieliang007)
            Reporter: Liang Xie (xieliang007)
            Votes: 0
            Watchers: 4
