Hadoop Common
HADOOP-5348

Create a ThrowableWritable for serializing exceptions robustly

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.21.0
    • Fix Version/s: None
    • Component/s: ipc
    • Labels:
      None

      Description

      HADOOP-5201 and other issues would benefit from a stable representation of exceptions: one that can be sent over the network, possibly pushed out to web UIs, and which we can be 100% sure the far end will be able to handle if it has the hadoop-core JAR on its classpath.

      Attachments

      1. ThrowableWritable.java
        4 kB
        Steve Loughran
      2. HADOOP-5348-2.patch
        16 kB
        Steve Loughran
      3. hadoop-5348.patch
        13 kB
        Steve Loughran

        Issue Links

          Activity

          Steve Loughran added a comment -

          Proposed features

          -something to take an exception and turn it (and any nested exceptions) into the wire format
          -have a constructor that doesn't generate a stack trace, for lower cost construction of the exceptions
          -retain things like classname and stack trace when creating the wire format
          -ability for the far end to decode the exception into a class that clients with hadoop-core.jar will be guaranteed to have
          -utility code to push it out in a human-readable form in a web interface
          -possibly: a way to add a stable fault-code string that could be used in client-side tests and would be less brittle than searching for text strings

          Non-goals
          -seamless conversion back into a throwable chain on the client side. We are not trying to do RMI here
          -all the complexity of SOAP 1.2 faults with actors, chained subfault-codes, etc. We are not trying to do SOAP or JAX-WS here
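
          A rough sketch of the kind of Writable these features and non-goals describe (field and method names are hypothetical, not taken from the attached patch): each throwable in the chain is reduced to its class name, message and textual stack frames, with the cause serialized recursively.

            import java.io.DataInput;
            import java.io.DataOutput;
            import java.io.IOException;

            import org.apache.hadoop.io.Text;
            import org.apache.hadoop.io.Writable;

            // Hypothetical sketch only, not the attached ThrowableWritable.java.
            public class ThrowableWritable implements Writable {
              private String classname;          // class of the original throwable
              private String message;            // its getMessage() value
              private String[] stackTrace;       // frames downgraded to plain text
              private ThrowableWritable cause;   // nested cause, or null

              public ThrowableWritable() {
                // no-arg constructor required for Writable deserialization
              }

              @Override
              public void write(DataOutput out) throws IOException {
                Text.writeString(out, classname == null ? "" : classname);
                Text.writeString(out, message == null ? "" : message);
                int frames = stackTrace == null ? 0 : stackTrace.length;
                out.writeInt(frames);
                for (int i = 0; i < frames; i++) {
                  Text.writeString(out, stackTrace[i]);
                }
                out.writeBoolean(cause != null);
                if (cause != null) {
                  cause.write(out);              // recurse down the cause chain
                }
              }

              @Override
              public void readFields(DataInput in) throws IOException {
                classname = Text.readString(in);
                message = Text.readString(in);
                stackTrace = new String[in.readInt()];
                for (int i = 0; i < stackTrace.length; i++) {
                  stackTrace[i] = Text.readString(in);
                }
                if (in.readBoolean()) {
                  cause = new ThrowableWritable();
                  cause.readFields(in);
                } else {
                  cause = null;
                }
              }
            }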

          Steve Loughran added a comment -

          This could also be used in the ping operations proposed in HADOOP-3628; these objects would be lighter weight than a list of exceptions if there were no need to create stack traces, yet still easy to marshal.

          Steve Loughran added a comment -

          First pass at this:

          1. You can create it from a chain of throwables, but you can't go the other way.
          2. The stack trace gets downgraded to a string array.
          3. No tests yet that the conversion works or that serialization/deserialization is reliable.

          Steve Loughran added a comment -

          Needs of HADOOP-5201

          1. Ability to construct new StackTraceElement lists from stack traces (i.e. do not down-convert to text)
          2. Ability to create a new exception from a deserialized exception, with toString(), stack trace, etc. recreating the details received
          3. Option of using error codes
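
          One way to meet point 1 above would be to keep each frame structured rather than flattening it to text. A hypothetical sketch (class and field names assumed, not from any attached patch):

            import java.io.DataInput;
            import java.io.DataOutput;
            import java.io.IOException;

            import org.apache.hadoop.io.Text;

            // Hypothetical frame holder: keeps the four StackTraceElement components
            // so the receiving side can rebuild real StackTraceElement instances.
            final class StackFrameWritable {
              private String declaringClass;
              private String methodName;
              private String fileName;   // empty string when the file is unknown
              private int lineNumber;

              static StackFrameWritable from(StackTraceElement e) {
                StackFrameWritable f = new StackFrameWritable();
                f.declaringClass = e.getClassName();
                f.methodName = e.getMethodName();
                f.fileName = e.getFileName() == null ? "" : e.getFileName();
                f.lineNumber = e.getLineNumber();
                return f;
              }

              StackTraceElement toStackTraceElement() {
                return new StackTraceElement(declaringClass, methodName,
                    fileName.length() == 0 ? null : fileName, lineNumber);
              }

              void write(DataOutput out) throws IOException {
                Text.writeString(out, declaringClass);
                Text.writeString(out, methodName);
                Text.writeString(out, fileName);
                out.writeInt(lineNumber);
              }

              void readFields(DataInput in) throws IOException {
                declaringClass = Text.readString(in);
                methodName = Text.readString(in);
                fileName = Text.readString(in);
                lineNumber = in.readInt();
              }
            }
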
          Steve Loughran added a comment -

          Some sample exception classes, mostly SOAP-fault related, that do similar things:

          Axis 1.x AxisFault
          http://svn.apache.org/viewvc/webservices/axis/trunk/java/src/org/apache/axis/AxisFault.java?view=markup

          • lots of complexity related to XML fragments, parsing, namespaces -stuff to avoid.
          • automatically adds hostname as one of the elements, very good for tracking down trouble in a cluster
          • Security: the normal Axis servlet strips out the stack before sending the fault over the wire, unless the machine has a "development box" switch set.

          Axis 2 AxisFault
          http://svn.apache.org/viewvc/webservices/axis2/trunk/java/modules/kernel/src/org/apache/axis2/AxisFault.java?view=markup

          • lots of SOAP1.2 complexity
          • extracts the inner cause of an InvocationTargetException or UndeclaredThrowableException and discards the wrapper
          • Security: has methods used to print the fault in an HTML page, methods that sanitize all strings by escaping angle brackets

          Alpine faulting
          http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/projects/alpine/prototype/M32/src/java/org/smartfrog/projects/alpine/faults/

          • Lots of SOAP1.1/1.2 complexity
          • has an interface, SoapFaultSource, that you can implement in any subclass of Throwable to say "I'll handle the fault to SOAP conversion".
          • does some escaping of all received details, in case a malicious far end decides to send back malicious html as the fault name
          • Can back-convert to an exception, but never tries to recreate the original (because of this: www.hpl.hp.com/techreports/2005/HPL-2005-83.pdf)

          All of these suffer from the complexity of SOAP stacks, and the SOAP 1.2 fault design in particular. What may be relevant (a rough sketch follows this list) is:

          1. Hostnames are invaluable to retain, to add to new exceptions
          2. Being able to add name/value data is good, but full XML is far too much; String->String works well
          3. It's nice to have a client/server flag
          4. We may need to think of security, at least the use case of converting an exception into HTML to display on a report page.
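
          A hypothetical illustration of the first, second and fourth points (class and method names assumed, not part of any attached patch): record the originating hostname, carry extra detail as a plain String-to-String map, and escape everything before it reaches a web page.

            import java.net.InetAddress;
            import java.net.UnknownHostException;
            import java.util.Map;
            import java.util.TreeMap;

            // Hypothetical helper, not part of any attached patch.
            class FaultDetail {
              private final Map<String, String> data = new TreeMap<String, String>();

              /** Record which host produced the fault; invaluable in a cluster. */
              void addHostname() {
                try {
                  data.put("hostname", InetAddress.getLocalHost().getHostName());
                } catch (UnknownHostException e) {
                  data.put("hostname", "(unknown)");
                }
              }

              /** Simple name/value detail; no XML needed. */
              void put(String name, String value) {
                data.put(name, value);
              }

              /** Escape received strings before rendering them in an HTML report page. */
              static String escapeForHtml(String text) {
                return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
              }
            }
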
          Steve Loughran added a comment -

          This is an updated patch with a test case. It does not attempt to go from the serialized form back to exceptions, only from exceptions to wire form. The recipient is left to deal with it in this format.

          You do not need to have an exception to create one of these objects - creating one with just a message is much lower cost, as capturing a stack trace is expensive in a JVM.
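
          Continuing the hypothetical sketch from the earlier comment (same assumed field names), the two construction paths might look like this; only the Throwable-based one pays for stack capture.

            // Hypothetical constructors for the sketch above, not the attached patch.

            /** Build from a live throwable, capturing class, message, frames and cause. */
            public ThrowableWritable(Throwable t) {
              classname = t.getClass().getName();
              message = t.getMessage();
              StackTraceElement[] frames = t.getStackTrace();
              stackTrace = new String[frames.length];
              for (int i = 0; i < frames.length; i++) {
                stackTrace[i] = frames[i].toString();
              }
              Throwable rootCause = t.getCause();
              if (rootCause != null && rootCause != t) {
                cause = new ThrowableWritable(rootCause);   // recurse down the chain
              }
            }

            /** Message-only form: no throwable, no stack capture, so very cheap. */
            public ThrowableWritable(String message) {
              this.message = message;
              this.classname = "";
              this.stackTrace = new String[0];
            }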

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12404535/hadoop-5348.patch
          against trunk revision 761632.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/109/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/109/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/109/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/109/console

          This message is automatically generated.

          Chris Douglas added a comment -

          I think the plan of record is to standardize on Avro for Hadoop's wire format.

          Steve Loughran added a comment -

          Maybe the term "wire format" is wrong; I should retitle this. It's about a stable Writable for exceptions that doesn't rely on the far end having every exception class in the Java chain. Rather than serialise the throwable and hope the far end has its classpath in perfect sync, right down to JDBC drivers, this turns the exception chain into a chain of ThrowableWritables, which can then be written and read.

          We could add Avro support to this when the time comes.
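
          A hypothetical round trip under that model, assuming the ThrowableWritable sketch from the earlier comments: the sender converts the chain once, and the receiver reads it back without needing any of the original exception classes on its classpath.

            import java.io.IOException;

            import org.apache.hadoop.io.DataInputBuffer;
            import org.apache.hadoop.io.DataOutputBuffer;

            // Hypothetical usage, assuming the sketched ThrowableWritable class.
            public class ThrowableWritableRoundTrip {
              public static void main(String[] args) throws IOException {
                Throwable fault = new RuntimeException("job failed",
                    new IOException("disk full"));
                ThrowableWritable sent = new ThrowableWritable(fault);

                // Serialize on the sending side.
                DataOutputBuffer out = new DataOutputBuffer();
                sent.write(out);

                // Deserialize on the receiving side; the original exception
                // classes are not needed to make sense of the payload.
                DataInputBuffer in = new DataInputBuffer();
                in.reset(out.getData(), out.getLength());
                ThrowableWritable received = new ThrowableWritable();
                received.readFields(in);
              }
            }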

          Chris Douglas added a comment -

          OK, I think I understand. Unfortunately, this issue proposes to add the type without any use cases in the framework. Absent context for this, it's hard to justify adding it to core on its own. If HADOOP-3628 can use it effectively (the ping functionality is a good idea), it would make more sense to add it as part of that work than to add it as an orphaned type.

          Steve Loughran added a comment -

          It's certainly part of HADOOP-3628, and is on that branch, but my little HADOOP-5621 JUnit runner also uses it for reporting structured test failures - so it is more broadly useful than just service management.

          Steve Loughran added a comment -

          This is the ThrowableWritable updated with (working) JUnit 4 tests. Although I need this for the lifecycle stuff, it's also needed if you are trying to include exceptions in any work, such as running JUnit tests as part of an MR job.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12428842/HADOOP-5348-2.patch
          against trunk revision 893490.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/233/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/233/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/233/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/233/console

          This message is automatically generated.

          Chris Douglas added a comment -
          • This pattern in equals is unnecessarily convoluted:
            +    return !(message != null
            +        ? !message.equals(that.message)
            +        : that.message != null);
            

            Using ternary operators in if expressions is similarly over-elaborate.

          • Throwables with cycles in causes will cause StackOverflowError; consider a max depth
          • The clone and copy constructor aren't really required, but why not call super.clone()?
          • Why a final class?

          Again, I'm not sure this is a general type that needs to be included in the framework. The issues it blocks (save the lifecycle work) have no code attached to them, making its utility more difficult to ascertain. The latest patches to HDFS-326 and HADOOP-6194 also include this code, and neither uses this class.

          I'd suggest closing this as "won't fix" and link it as incorporated in the lifecycle work.
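
          Hypothetical sketches of two of the review points above (names assumed, and written against the sketch class from the earlier comments rather than the attached patch): a flatter null-safe equality helper in place of the nested ternary, and an iterative, depth-capped walk of the cause chain so a cyclic cause cannot overflow the stack.

            // Hypothetical additions to the sketched ThrowableWritable class.
            private static final int MAX_CAUSE_DEPTH = 32;   // assumed cap

            /** Flatter replacement for the nested-ternary comparison. */
            private static boolean equalOrBothNull(String a, String b) {
              return (a == null) ? (b == null) : a.equals(b);
            }

            /** Convert a throwable chain iteratively, stopping at MAX_CAUSE_DEPTH. */
            public static ThrowableWritable depthLimited(Throwable t) {
              ThrowableWritable root = new ThrowableWritable(t.toString());
              ThrowableWritable tail = root;
              Throwable next = t.getCause();
              int depth = 1;
              while (next != null && depth < MAX_CAUSE_DEPTH) {
                ThrowableWritable converted = new ThrowableWritable(next.toString());
                tail.cause = converted;    // 'cause' field from the earlier sketch
                tail = converted;
                next = next.getCause();
                depth++;
              }
              return root;
            }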

          Steve Loughran added a comment -
          1. I agree, there's no compelling need yet, but as I'm also using it in my JUnit-as-MR-jobs test runner, I've teased it out.
          2. And it's in its own Git branch on github: HADOOP-5348-ThrowableWritable
          3. Marked as final because, when you are trying to diagnose why things failed, it's not useful to get an exception that you can't deserialize. Seen that too often in RMI-land to want to repeat it.
          4. I'll look at the equals(), copy(), clone(); they are probably mostly IDE generated.

            People

            • Assignee:
              Steve Loughran
              Reporter:
              Steve Loughran
            • Votes:
              0
              Watchers:
              6
