Hadoop Map/Reduce / MAPREDUCE-6024

java.net.SocketTimeoutException in Fetcher caused jobs to be stuck for more than 1 hour

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0
    • Component/s: mr-am, task
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      2014-08-04 21:09:42,356 WARN fetcher#33 org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to fake.host.name:13562 with 2 map outputs
      java.net.SocketTimeoutException: Read timed out
      at java.net.SocketInputStream.socketRead0(Native Method)
      at java.net.SocketInputStream.read(SocketInputStream.java:129)
      at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
      at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
      at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
      at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:697)
      at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:640)
      at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
      at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:289)
      at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
      2014-08-04 21:09:42,360 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: fake.host.name:13562 freed by fetcher#33 in 180024ms
      2014-08-04 21:09:55,360 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning fake.host.name:13562 with 3 to fetcher#33
      2014-08-04 21:09:55,360 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 3 of 3 to fake.host.name:13562 to fetcher#33
      2014-08-04 21:12:55,463 WARN fetcher#33 org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to fake.host.name:13562 with 3 map outputs
      java.net.SocketTimeoutException: Read timed out
      at java.net.SocketInputStream.socketRead0(Native Method)
      at java.net.SocketInputStream.read(SocketInputStream.java:129)
      at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
      at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
      at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
      at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:697)
      at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:640)
      at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
      at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:289)
      at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
      ...
      2014-08-04 22:03:13,416 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: fake.host.name:13562 freed by fetcher#33 in 271081ms
      2014-08-04 22:04:13,417 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning fake.host.name:13562 with 3 to fetcher#33
      2014-08-04 22:04:13,417 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 3 of 3 to fake.host.name:13562 to fetcher#33
      2014-08-04 22:07:13,449 WARN fetcher#33 org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to fake.host.name:13562 with 3 map outputs
      java.net.SocketTimeoutException: Read timed out
      at java.net.SocketInputStream.socketRead0(Native Method)
      at java.net.SocketInputStream.read(SocketInputStream.java:129)
      at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
      at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
      at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
      at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:697)
      at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:640)
      at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
      at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:289)
      at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
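
      The timing in the log above can be checked mechanically. The following is a minimal sketch, not part of the attached patches: the class FetcherStallReport and its methods are hypothetical names introduced here, and the regular expressions assume the exact log layout shown in this description. It measures the time between the first and last "Failed to connect" warning and collects the "freed by ... in Nms" durations.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading Fetcher logs; not a Hadoop class.
class FetcherStallReport {
    // Timestamp layout used by the log lines in this issue.
    private static final DateTimeFormatter TS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Matches the WARN line a Fetcher emits when a host cannot be read.
    private static final Pattern FAILED = Pattern.compile(
        "(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}) WARN \\S+ "
        + "\\S+Fetcher: Failed to connect to \\S+ with \\d+ map outputs");

    // Matches the INFO line that reports how long a fetcher held a host.
    private static final Pattern FREED =
        Pattern.compile("freed by fetcher#\\d+ in (\\d+)ms");

    // Timestamps of every "Failed to connect" warning, in log order.
    static List<LocalDateTime> failureTimes(List<String> lines) {
        List<LocalDateTime> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = FAILED.matcher(line);
            if (m.find()) {
                out.add(LocalDateTime.parse(m.group(1), TS));
            }
        }
        return out;
    }

    // Time between the first and last failure; zero if fewer than two.
    static Duration stallSpan(List<String> lines) {
        List<LocalDateTime> times = failureTimes(lines);
        if (times.size() < 2) {
            return Duration.ZERO;
        }
        return Duration.between(times.get(0), times.get(times.size() - 1));
    }

    // Milliseconds reported by each "freed by fetcher#N in Nms" line.
    static List<Long> heldMillis(List<String> lines) {
        List<Long> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = FREED.matcher(line);
            if (m.find()) {
                out.add(Long.parseLong(m.group(1)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Lines copied from the description; stack frames are omitted.
        List<String> log = Arrays.asList(
            "2014-08-04 21:09:42,356 WARN fetcher#33 org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to fake.host.name:13562 with 2 map outputs",
            "2014-08-04 21:09:42,360 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: fake.host.name:13562 freed by fetcher#33 in 180024ms",
            "2014-08-04 21:12:55,463 WARN fetcher#33 org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to fake.host.name:13562 with 3 map outputs",
            "2014-08-04 22:03:13,416 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: fake.host.name:13562 freed by fetcher#33 in 271081ms",
            "2014-08-04 22:07:13,449 WARN fetcher#33 org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to fake.host.name:13562 with 3 map outputs");

        System.out.println("failures: " + failureTimes(log).size());
        System.out.println("span (ms): " + stallSpan(log).toMillis());
        System.out.println("held (ms): " + heldMillis(log));
    }
}
```

      For the excerpt above, the first and last failures are about 57.5 minutes apart, and each attempt fails roughly 180 seconds after the host is assigned (for example, assigned at 22:04:13,417 and failed at 22:07:13,449). That pattern suggests the fetcher repeatedly waits out a full read timeout against the same host.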

      Attachments

        1. MAPREDUCE-6024.1.patch (13 kB, yunjiong zhao)
        2. MAPREDUCE-6024.2.patch (13 kB, yunjiong zhao)
        3. MAPREDUCE-6024.3.patch (10 kB, yunjiong zhao)
        4. MAPREDUCE-6024.4.patch (10 kB, yunjiong zhao)
        5. MAPREDUCE-6024.5.patch (10 kB, yunjiong zhao)
        6. MAPREDUCE-6024.patch (12 kB, yunjiong zhao)

          People

            Assignee: yunjiong zhao (zhaoyunjiong)
            Reporter: yunjiong zhao (zhaoyunjiong)
            Votes: 0
            Watchers: 7