Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-931

DFSClient.close doesn't rethrow exceptions thrown by automatically closed clients

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Critical Critical
    • Resolution: Unresolved
    • Affects Version/s: 0.21.0, 0.22.0
    • Fix Version/s: None
    • Component/s: hdfs-client
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change

      Description

      I've written a fault injection test that simply throws exceptions on every ack.readFields call in ResponseProcessor. This ought to raise an exception to the client trying to use the stream, but instead it ignores it silently.

      1. TEST-org.apache.hadoop.hdfs.TestFiPipelines.txt
        83 kB
        Todd Lipcon
      2. hdfs-931.txt
        5 kB
        Todd Lipcon
      3. hdfs-931.txt
        4 kB
        Todd Lipcon

        Activity

        Todd Lipcon created issue -
        Hide
        Todd Lipcon added a comment -

        Here's the failing unit test along with a log of its output.

        Show
        Todd Lipcon added a comment - Here's the failing unit test along with a log of its output.
        Todd Lipcon made changes -
        Field Original Value New Value
        Attachment TEST-org.apache.hadoop.hdfs.TestFiPipelines.txt [ 12431742 ]
        Attachment hdfs-931.txt [ 12431743 ]
        Todd Lipcon made changes -
        Assignee Todd Lipcon [ tlipcon ]
        Hide
        Todd Lipcon added a comment -

        This turns out to be a bit of a different bug/feature. The test case was using fs.close() in order to implicitly close the stream. This caused the stream to flush on close, which is where the errors occurred. Since LeaseChecker.close swallows exceptions, it never threw.

        Is there a reason that LeaseChecker.close does not rethrow these exceptions as a MultipleIOException?

        Show
        Todd Lipcon added a comment - This turns out to be a bit of a different bug/feature. The test case was using fs.close() in order to implicitly close the stream. This caused the stream to flush on close, which is where the errors occurred. Since LeaseChecker.close swallows exceptions, it never threw. Is there a reason that LeaseChecker.close does not rethrow these exceptions as a MultipleIOException?
        Todd Lipcon made changes -
        Summary Failing DFSOutputStream does not throw exception back to client DFSClient.close doesn't rethrow exceptions thrown by automatically closed clients
        Affects Version/s 0.20.1 [ 12314048 ]
        Priority Blocker [ 1 ] Critical [ 2 ]
        Hide
        Todd Lipcon added a comment -

        Here is a patch that modifies DFSClient.close (really LeaseChecker.close) in order to propagate IOExceptions from closed files.

        This is an incompatible change in a way, but I think it matches the semantics people expect.

        Show
        Todd Lipcon added a comment - Here is a patch that modifies DFSClient.close (really LeaseChecker.close) in order to propagate IOExceptions from closed files. This is an incompatible change in a way, but I think it matches the semantics people expect.
        Todd Lipcon made changes -
        Attachment hdfs-931.txt [ 12431749 ]
        Todd Lipcon made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hide
        stack added a comment -

        Is there any value in letting exceptions out of a close? What will the application do with an exception thrown on a close?

        Show
        stack added a comment - Is there any value in letting exceptions out of a close? What will the application do with an exception thrown on a close?
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12431749/hdfs-931.txt
        against trunk revision 903906.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 4 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/107/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/107/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/107/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/107/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12431749/hdfs-931.txt against trunk revision 903906. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/107/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/107/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/107/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/107/console This message is automatically generated.
        Hide
        steve_l added a comment -

        The action must depend on the app. Logging the problem is a common action, but retrying the operation is another possibility.

        Show
        steve_l added a comment - The action must depend on the app. Logging the problem is a common action, but retrying the operation is another possibility.
        Hide
        dhruba borthakur added a comment -

        > Is there any value in letting exceptions out of a close? What will the application do with an exception thrown on a close?

        The application can know that the data it wrote to the file could be bad and trigger appropriate recovery. In fact, it is very important that this exception be thrown to the applications instead of silently ignoring it.

        Show
        dhruba borthakur added a comment - > Is there any value in letting exceptions out of a close? What will the application do with an exception thrown on a close? The application can know that the data it wrote to the file could be bad and trigger appropriate recovery. In fact, it is very important that this exception be thrown to the applications instead of silently ignoring it.
        Hide
        Todd Lipcon added a comment -

        I'm cancelling patch since the test failures actually all seem to be related to this.

        So do people agree on this direction? If so I will work through the test failures.

        Also, should we consider this a breaking change? I think so - people might be relying on this behavior (eg using fs.close() in finally blocks and not expecting it to throw).

        Show
        Todd Lipcon added a comment - I'm cancelling patch since the test failures actually all seem to be related to this. So do people agree on this direction? If so I will work through the test failures. Also, should we consider this a breaking change? I think so - people might be relying on this behavior (eg using fs.close() in finally blocks and not expecting it to throw).
        Todd Lipcon made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Affects Version/s 0.20.1 [ 12314048 ]
        Hide
        dhruba borthakur added a comment -

        This could break existing applications in a bad way!

        Show
        dhruba borthakur added a comment - This could break existing applications in a bad way!
        Hide
        Todd Lipcon added a comment -

        Agreed. Do we dare just break them and put it in the release notes as an incompatible change?

        One reason for doing so is that close() already is listed as throwing IOException. So it's not an API change, and in theory people should have catch blocks in place.

        Show
        Todd Lipcon added a comment - Agreed. Do we dare just break them and put it in the release notes as an incompatible change? One reason for doing so is that close() already is listed as throwing IOException. So it's not an API change, and in theory people should have catch blocks in place.
        Hide
        dhruba borthakur added a comment -

        I think we should make this change only in trunk. Do you have any use case why it needs to be back ported to already released branches?

        Show
        dhruba borthakur added a comment - I think we should make this change only in trunk. Do you have any use case why it needs to be back ported to already released branches?
        Hide
        Todd Lipcon added a comment -

        I think we should make this change only in trunk. Do you have any use case why it needs to be back ported to already released branches?

        I agree. Trunk only in my opinion.

        Show
        Todd Lipcon added a comment - I think we should make this change only in trunk. Do you have any use case why it needs to be back ported to already released branches? I agree. Trunk only in my opinion.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        Mark this as Incompatible change.

        Show
        Tsz Wo Nicholas Sze added a comment - Mark this as Incompatible change.
        Tsz Wo Nicholas Sze made changes -
        Hadoop Flags [Incompatible change]

          People

          • Assignee:
            Todd Lipcon
            Reporter:
            Todd Lipcon
          • Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated:

              Development