HDFS-7392: org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: hdfs-client
    • Labels: None

    Description

      In some specific circumstances, org.apache.hadoop.hdfs.DistributedFileSystem.open(invalid URI) never times out and hangs forever.
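      A minimal sketch of how the hang can be triggered, using the hypothetical URI and path from the circumstances below; this is an illustration of the call pattern, not a tested reproduction:

      import java.net.URI;

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FSDataInputStream;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;

      public class OpenHangRepro {
          public static void main(String[] args) throws Exception {
              Configuration conf = new Configuration();
              // share.example.com resolves to live hosts, but no NameNode listens on 8020
              FileSystem fs = FileSystem.get(
                      URI.create("hdfs://share.example.com:8020"), conf);
              // Expected: the call fails after the configured socket-timeout retries.
              // Observed: it retries forever and never returns.
              try (FSDataInputStream in = fs.open(new Path("/someDir/someFile.txt"))) {
                  System.out.println("first byte: " + in.read());
              }
          }
      }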

      The specific circumstances are:
      1) The HDFS URI (hdfs://share.example.com:8020/someDir/someFile.txt) points to a valid IP address, but no NameNode service is running on it.
      2) The host name resolves to at least 2 IP addresses. See the output below:

      /proj/quickbox$ nslookup share.example.com
      Server: 127.0.1.1
      Address: 127.0.1.1#53

      share.example.com canonical name = internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com.
      Name: internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com
      Address: 192.168.1.223
      Name: internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com
      Address: 192.168.1.65

      In such a case, org.apache.hadoop.ipc.Client.Connection.updateAddress() sometimes returns true (even though the address didn't actually change, see img. 1) and the timeoutFailures counter is reset to 0 (see img. 2). As a result, maxRetriesOnSocketTimeouts (45) is never reached and the connection attempt is repeated forever.
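      A simplified sketch of that failure mode is below; timeoutFailures, maxRetriesOnSocketTimeouts and updateAddress() are the names mentioned above, while the loop shape and connect() are illustrative approximations rather than the actual Hadoop source:

      import java.net.InetSocketAddress;
      import java.net.SocketTimeoutException;

      class ConnectionRetrySketch {
          private InetSocketAddress server = new InetSocketAddress("share.example.com", 8020);
          private int timeoutFailures = 0;
          private final int maxRetriesOnSocketTimeouts = 45;

          void setupConnection() {
              while (true) {
                  try {
                      connect(server);                 // times out: host is up, no NameNode
                      return;
                  } catch (SocketTimeoutException toe) {
                      // A fresh lookup of the round-robin ELB name keeps flipping
                      // between 192.168.1.223 and 192.168.1.65, so updateAddress()
                      // keeps reporting an address change even though nothing changed.
                      if (updateAddress()) {
                          timeoutFailures = 0;         // counter is reset on every flip
                      }
                      if (timeoutFailures++ >= maxRetriesOnSocketTimeouts) {
                          throw new RuntimeException("retries on socket timeouts exhausted", toe);
                      }
                      // With the reset above, the limit is never reached: the loop spins forever.
                  }
              }
          }

          private boolean updateAddress() {
              return true;                             // re-resolve; pretend the address changed
          }

          private void connect(InetSocketAddress addr) throws SocketTimeoutException {
              throw new SocketTimeoutException("connect timed out");
          }
      }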

      Attachments

        1. HDFS-7392.diff
          2 kB
          Frantisek Vacek
        2. 2.png
          103 kB
          Frantisek Vacek
        3. 1.png
          132 kB
          Frantisek Vacek


          People

            Assignee: Daniel Pesch (peschd)
            Reporter: Frantisek Vacek (vacekf)
            Votes: 0
            Watchers: 6

            Dates

              Created:
              Updated: