  1. Hadoop HDFS
  2. HDFS-7224

Allow reuse of NN connections via webhdfs

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.5.0
    • Fix Version/s: 2.7.0
    • Component/s: webhdfs
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      In very large clusters, the webhdfs client can hit bind exceptions because it runs out of ephemeral
      ports. This can happen when webhdfs is used to talk to the NN in order to glob or list a
      very large number of files.

      WebHdfsFileSystem#jsonParse gets the input/error stream from the connection
      but never closes it. Because the stream is left open, the JVM assumes it may still
      be transferring data, so the next request through this code has to open a new connection
      rather than reuse an existing one.

      The lack of connection reuse increases latency and adds unnecessary overhead to the NN.
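
      The stream handling described above can be sketched as follows. This is a minimal
      illustration, assuming the client talks to the NN through java.net.HttpURLConnection;
      it is not the code from the attached patches. The class KeepAliveSketch and its
      methods readFullyAndClose and readResponse are hypothetical names. The idea is to
      read the response to the end and always close the stream, so that the JVM is free
      to keep the underlying connection alive for the next request.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not taken from the HDFS-7224 patch.
final class KeepAliveSketch {

    // Reads the stream to EOF and closes it, even if reading fails part way.
    static String readFullyAndClose(InputStream in) throws IOException {
        if (in == null) {
            // getErrorStream() may return null when the server sent no error body.
            return "";
        }
        try (InputStream stream = in) {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = stream.read(buffer)) != -1) {
                body.write(buffer, 0, n);
            }
            return new String(body.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Picks the input or error stream from the status code, then reads and closes it.
    static String readResponse(HttpURLConnection conn) throws IOException {
        InputStream in = conn.getResponseCode() >= HttpURLConnection.HTTP_BAD_REQUEST
                ? conn.getErrorStream()
                : conn.getInputStream();
        return readFullyAndClose(in);
    }
}
```

      A caller would pass the returned string to its JSON parser instead of handing the
      parser the raw stream, which keeps the close in one place regardless of whether
      parsing succeeds.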

        Attachments

        1. HDFS-7224.v1.201410301923.txt
          4 kB
          Eric Payne
        2. HDFS-7224.v2.201410312033.txt
          5 kB
          Eric Payne
        3. HDFS-7224.v3.txt
          5 kB
          Eric Payne
        4. HDFS-7224.v4.txt
          5 kB
          Eric Payne

            People

            • Assignee:
              epayne Eric Payne
              Reporter:
              epayne Eric Payne
            • Votes:
              0
              Watchers:
              8
