Hadoop HDFS / HDFS-5671

Fix socket leak in DFSInputStream#getBlockReader

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: hdfs-client
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      lsof -i TCP:1004 | grep -c CLOSE_WAIT
      18235
      When a client requests a file's block from a DataNode (port 1004) and the request fails with "java.io.IOException: Got error for OP_READ_BLOCK, Block token is expired.", the TCP socket that the regionserver is using is not closed.

      I think the problem is in DatanodeInfo blockSeekTo(long target) of class DFSInputStream.
      The connection the client uses is a BlockReader:
      blockReader = getBlockReader(targetAddr, chosenNode, src, blk,
      accessToken, offsetIntoBlock, blk.getNumBytes() - offsetIntoBlock,
      buffersize, verifyChecksum, dfsClient.clientName);

      DFSInputStream.blockSeekTo() (line 533) invokes getBlockReader(), which creates a peer using newTcpPeer(dnAddr) (line 1107). When BlockReaderFactory.newBlockReader throws an IOException, the peer is not closed, which leaves a CLOSE_WAIT connection.
      In our test, when the datanode gets an InvalidToken exception in DataXceiver.checkAccess(), it closes the connection. On the regionserver side, in RemoteBlockReader2.newBlockReader(), checkSuccess() throws an InvalidBlockTokenException. DFSInputStream.blockSeekTo() catches the exception, but the connection is NOT closed, so it stays in CLOSE_WAIT.
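
The leak described above, and one way to avoid it, can be sketched in isolation. This is a minimal illustration of a close-on-failure pattern, not the attached patch: FakePeer, FakeBlockReader and the two-argument getBlockReader are hypothetical stand-ins, not the real Hadoop classes or signatures. The idea is that whoever creates the peer must close it if the reader that would take ownership of it is never constructed.

```java
import java.io.Closeable;
import java.io.IOException;

public class PeerLeakSketch {
    /** Hypothetical stand-in for a TCP peer; records whether close() ran. */
    static class FakePeer implements Closeable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Hypothetical stand-in for a block reader that owns its peer once built. */
    static class FakeBlockReader {
        final FakePeer peer;

        FakeBlockReader(FakePeer peer) {
            this.peer = peer;
        }
    }

    /** Simulates reader construction failing the way an expired block token would. */
    static FakeBlockReader newBlockReader(FakePeer peer, boolean tokenExpired)
            throws IOException {
        if (tokenExpired) {
            throw new IOException("Got error for OP_READ_BLOCK, Block token is expired.");
        }
        return new FakeBlockReader(peer);
    }

    /** Hands the peer to the reader; closes the peer if construction fails. */
    static FakeBlockReader getBlockReader(FakePeer peer, boolean tokenExpired)
            throws IOException {
        boolean success = false;
        try {
            FakeBlockReader reader = newBlockReader(peer, tokenExpired);
            success = true;
            return reader;
        } finally {
            if (!success) {
                // Without this, the socket would linger in CLOSE_WAIT.
                peer.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        FakePeer failing = new FakePeer();
        try {
            getBlockReader(failing, true);
        } catch (IOException e) {
            System.out.println("closed after failure: " + failing.closed);
        }
        FakePeer ok = new FakePeer();
        getBlockReader(ok, false);
        System.out.println("closed after success: " + ok.closed);
    }
}
```

On the success path the peer is deliberately left open, since the reader now owns it and is responsible for closing it later.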

      Attachments

        1. HDFS-5671.patch
          1 kB
          Uma Maheswara Rao G
        2. 5671v4.patch
          1 kB
          JamesLi
        3. 5671v3.patch
          1 kB
          JamesLi
        4. 5671v2.patch
          1 kB
          JamesLi
        5. 5671v1.patch
          1 kB
          JamesLi
        6. 5671.patch
          1 kB
          JamesLi

            People

              Assignee: JamesLi (hbjoylee)
              Reporter: JamesLi (hbjoylee)
              Votes: 0
              Watchers: 11
