Hadoop Common / HADOOP-9576

Make NetUtils.wrapException throw EOFException instead of wrapping it as IOException

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0, 1.2.1
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      When an EOFException occurs, NetUtils currently wraps it as a plain IOException. We may want to throw the EOFException as it is: an EOFException can happen when the connection is lost mid-call, and the client may want to handle that case explicitly.
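      To illustrate why the exception type matters to callers, here is a minimal sketch of client-side handling. The class `HeartbeatSketch`, the `RpcCall` interface, and the returned strings are hypothetical and are not part of Hadoop; the only point is that the `EOFException` branch is reachable solely when the exception keeps its own type.

```java
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical client-side handling; none of these names come from Hadoop. */
final class HeartbeatSketch {

    /** Stand-in for an RPC call that may fail with an I/O error. */
    interface RpcCall {
        void run() throws IOException;
    }

    /** Classifies the outcome of a single call. */
    static String classify(RpcCall call) {
        try {
            call.run();
            return "ok";
        } catch (EOFException e) {
            // Only reachable if the EOFException is rethrown with its own type
            // rather than wrapped in a plain IOException.
            return "connection lost";
        } catch (IOException e) {
            // A wrapped EOFException lands here; the caller would have to
            // inspect getCause() to tell it apart from other failures.
            return "other I/O failure";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(() -> { throw new EOFException(); }));
        System.out.println(classify(() -> {
            throw new IOException("Failed on local exception", new EOFException());
        }));
    }
}
```

      With wrapping as described above, a caller sees only the second branch and has to unwrap the cause by hand.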

          Activity

          Steve Loughran added a comment -

          Makes sense. That wrapping was meant to cover the expected problems and make them easier to diagnose; swallowing the common exception types was never the goal.

          Steve Loughran added a comment -

          I've seen this, and yes, it should be handled and marked as "connection torn down":

          2013-06-05 10:51:36,776 [AMRM Heartbeater thread] ERROR client.AMRMClientAsync (AMRMClientAsync.java:run(287)) - IO exception on heartbeat
          java.io.IOException: Failed on local exception: java.io.EOFException; Host Details : local host is: "stevel-2.local/192.168.1.85"; destination host is: "192.168.1.85":62370; 
          	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
          	at org.apache.hadoop.ipc.Client.call(Client.java:1307)
          	at org.apache.hadoop.ipc.Client.call(Client.java:1255)
          	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:204)
          	at com.sun.proxy.$Proxy9.allocate(Unknown Source)
          	at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:75)
          	at org.apache.hadoop.yarn.client.AMRMClientImpl.allocate(AMRMClientImpl.java:234)
          	at org.apache.hadoop.yarn.client.AMRMClientAsync$HeartbeatThread.run(AMRMClientAsync.java:279)
          Caused by: java.io.EOFException
          	at java.io.DataInputStream.readInt(DataInputStream.java:375)
          	at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:948)
          	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:845)
          

          We'll need a corresponding wiki page along with the code patch.

          Steve Loughran added a comment -

          Patch submitted; it also fixes a cut-and-paste text error in the NoRouteToHost handler.

          Steve Loughran added a comment -

          The wiki entry is at https://wiki.apache.org/hadoop/EOFException. I had to cover FileSystem EOFs too, as well as things like HTTP channels shutting down early.

          This is what the stack trace looks like:

          java.io.EOFException: EOFException from  local host is: "stevel-2.local/192.168.1.85"; destination host is: "192.168.1.85":63514; java.io.EOFException; For more details see:  http://wiki.apache.org/hadoop/EOFException
          	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
          	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
          	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
          	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
          	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:790)
          	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
          	at org.apache.hadoop.ipc.Client.call(Client.java:1307)
          	at org.apache.hadoop.ipc.Client.call(Client.java:1255)
          	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:204)
          	at com.sun.proxy.$Proxy9.allocate(Unknown Source)
          	at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:75)
          	at org.apache.hadoop.yarn.client.AMRMClientImpl.allocate(AMRMClientImpl.java:234)
          	at org.apache.hadoop.yarn.client.AMRMClientAsync$HeartbeatThread.run(AMRMClientAsync.java:279)
          Caused by: java.io.EOFException
          	at java.io.DataInputStream.readInt(DataInputStream.java:375)
          	at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:948)
          	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:845)
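
          The `Constructor.newInstance` frames under `NetUtils.wrapWithMessage` suggest the exception is re-created reflectively so that it keeps its class while gaining a longer message. Below is a minimal sketch of that idea; it is not the actual NetUtils code, and the class `RewrapSketch`, the method `withMessage`, and the fallback behaviour are assumptions made for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.lang.reflect.Constructor;

/** Hypothetical helper; a sketch of type-preserving rewrapping, not Hadoop code. */
final class RewrapSketch {

    /**
     * Returns an exception of the same class as {@code cause} carrying
     * {@code message}, with {@code cause} attached as its cause. Falls back
     * to a plain IOException when the class has no public (String) constructor.
     */
    static IOException withMessage(IOException cause, String message) {
        try {
            Constructor<? extends IOException> ctor =
                    cause.getClass().getConstructor(String.class);
            IOException wrapped = ctor.newInstance(message);
            wrapped.initCause(cause);
            return wrapped;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Assumed fallback: lose the specific type but keep the cause.
            return new IOException(message, cause);
        }
    }

    public static void main(String[] args) {
        IOException e = withMessage(new EOFException(), "EOFException from local host");
        System.out.println(e.getClass().getName() + ": " + e.getMessage());
    }
}
```

          A caller can then catch `EOFException` directly and still reach the original exception through `getCause()`, which matches the "Caused by" section of the trace.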
          
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12586310/HADOOP-9576-001.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2600//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2600//console

          This message is automatically generated.

          Jian He added a comment -

          Steve, thanks for taking this over.
          Not related to this patch: there's already a getHostDetailsAsString method, which is only used by wrapException(). Can we use this method for all exception types, for consistency?
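
          As a rough illustration of that suggestion, one shared formatter could produce the host details for every exception type. The sketch below is hypothetical: the class `HostDetailsSketch` and the signature of `hostDetails` are assumptions, not the real getHostDetailsAsString, and the output format simply mirrors the messages in the traces above.

```java
/** Hypothetical formatter; the real getHostDetailsAsString may differ. */
final class HostDetailsSketch {

    /** Builds the host-details fragment in the style seen in the logged messages. */
    static String hostDetails(String localHost, String destHost, int destPort) {
        return "local host is: \"" + localHost + "\"; "
                + "destination host is: \"" + destHost + "\":" + destPort + "; ";
    }

    public static void main(String[] args) {
        // Every handler would prepend its own description and reuse the same fragment.
        String details = hostDetails("stevel-2.local/192.168.1.85", "192.168.1.85", 62370);
        System.out.println("EOFException from " + details);
        System.out.println("No route to host from " + details);
    }
}
```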


            People

            • Assignee:
              Steve Loughran
              Reporter:
              Jian He
            • Votes:
              0
              Watchers:
              5
