Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.5.0, 3.0.0-alpha1
    • Fix Version/s: 2.6.0
    • Component/s: hdfs-client
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      Input streams lost their timeout. The problem appears to be DFSClient#newConnectedPeer does not set the read timeout. During a temporary network interruption the server will close the socket, unbeknownst to the client host, which blocks on a read forever.

      The results are dire. Services such as the RM, JHS, NMs, oozie servers, etc all need to be restarted to recover - unless you want to wait many hours for the tcp stack keepalive to detect the broken socket.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                daryn Daryn Sharp
                Reporter:
                daryn Daryn Sharp
              • Votes:
                0 Vote for this issue
                Watchers:
                21 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: