Hadoop Common: HADOOP-7342

Add an utility API in FileUtil for JDK File.list

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: None
    • Labels: None

      Description

      The Java File.list() API can return null when the disk is bad or the path is not a directory. This utility API in FileUtil will throw an exception in those cases rather than returning null.
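      For illustration, here is a minimal sketch of what such a wrapper could look like. The class name, method name, and exception message below are assumptions for illustration only, not necessarily what the attached patch adds to org.apache.hadoop.fs.FileUtil.

      import java.io.File;
      import java.io.IOException;

      // Hypothetical sketch of the proposed utility API.
      public final class FileListUtilSketch {
        private FileListUtilSketch() {}

        /**
         * Lists a directory like File.list(), but throws an IOException
         * instead of returning null when the listing fails (for example,
         * a bad disk or a path that is not a directory).
         */
        public static String[] list(File dir) throws IOException {
          String[] entries = dir.list();
          if (entries == null) {
            throw new IOException("Could not list directory (bad disk or not a directory): " + dir);
          }
          return entries;
        }
      }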

      Attachments

      1. HADOOP-7342-1.patch (2 kB), Bharath Mundlapudi
      2. HADOOP-7342-2.patch (2 kB), Matt Foley


          Activity

          Bharath Mundlapudi added a comment -

          Attaching a patch for this.

          Konstantin Boudnik added a comment -

           A couple of comments, Bharath:

           • This sounds like you are trying to use exceptions to replace return-value checks. Now you'd have to handle IOException around every call to the new API. Besides, you're somewhat violating the original File.list() contract, which doesn't declare IOException.

           Also, the asserts in the tests should have messages, to make test failures easier to diagnose.

          Bharath Mundlapudi added a comment -

          Hi Konstantin,

          Thanks for your comments and for reviewing the code.

          My comments:
           If you look at most of the code in HDFS, it already has IOException handling around most of these methods. The problem is only that NPEs are not handled. We have seen in our tests that an NPE can have a bad impact in certain cases when disks go bad.

           IMO, this is a problem with the JDK API. Instead of asking the user to add these null checks everywhere around these calls, the API should have been consistent in its return values.

           Regarding the asserts, I did not add messages to assertEquals only because, on failure, the assertion already reports the expected and actual values. It will be clear from the logs that assertEquals failed.

          Matt Foley added a comment -

           Cos, the reason I support this new utility API is that Hadoop is full of constructs like:

          // in context of a "throws IOException" or try/catch IOE:
          File[] files = dir.list();
          if (files.length > 0) {...}
          

           This code handles both empty and non-empty directory listings, and is already prepared to handle IOExceptions. Unfortunately, the POSIX-style File.list() API returns null where one would reasonably expect it to throw IOException, causing an NPE on the .length access, which, being a runtime exception, typically isn't even anticipated.

           Rather than injecting null-checking code with a conditional throw into every place like this, Bharath has suggested calling the new FileUtil method. Developers can still use the JDK method when a null return on I/O error is desirable. I think this is a good idea.
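           For illustration only, a call site like the one above could then be written against such a wrapper (FileListUtilSketch.list is the hypothetical helper sketched under the Description, not necessarily the exact method in the patch):

           // in context of a "throws IOException" or try/catch IOE:
           // the wrapper throws IOException instead of returning null, so the
           // existing error path covers the bad-disk case and no NPE is possible
           String[] files = FileListUtilSketch.list(dir);
           if (files.length > 0) {...}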

          Matt Foley added a comment -

           Oops, that should be String[], not File[]. The same logic applies to .listFiles().
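           A sketch of the analogous wrapper for File.listFiles(), which has the same null-on-error contract (again, the name and message are illustrative assumptions, placed alongside the list() sketch above):

           /**
            * Lists a directory like File.listFiles(), but throws an IOException
            * instead of returning null when the listing fails.
            */
           public static File[] listFiles(File dir) throws IOException {
             File[] files = dir.listFiles();
             if (files == null) {
               throw new IOException("Could not list directory (bad disk or not a directory): " + dir);
             }
             return files;
           }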

          Konstantin Boudnik added a comment -

           Matt, according to Jacob's comment this API change has already been discussed and I missed that part. So, taking it back: let's go with this (although it seems unusual).

          Konstantin Boudnik added a comment -

           "This will be clear from the logs that assertEquals failed."

           Bharath, it will be apparent that it failed, but it won't be clear why without looking at the test's code.
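           For illustration, the difference Cos is pointing at, in JUnit 4 style. The test class, directory name, and assertion text below are made up for this example and are not taken from the attached patch.

           import static org.junit.Assert.assertEquals;
           import static org.junit.Assert.assertTrue;

           import java.io.File;
           import org.junit.Test;

           public class TestAssertMessageSketch {
             @Test
             public void listedEntryCount() {
               File dir = new File(System.getProperty("java.io.tmpdir"),
                   "assert-msg-sketch-" + System.nanoTime());
               assertTrue("Could not create test dir " + dir, dir.mkdirs());

               String[] listed = dir.list();
               // Plain form: a failure reports only the expected and actual values.
               //   assertEquals(0, listed.length);
               // With a message, the log also says what was checked and on which directory:
               assertEquals("Unexpected number of entries listed for new dir " + dir,
                   0, listed.length);

               dir.delete();
             }
           }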

          Matt Foley added a comment -

           Added the assert messages Cos requested, and triggered patch submission.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481365/HADOOP-7342-2.patch
          against trunk revision 1130833.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/568//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/568//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/568//console

          This message is automatically generated.

          Matt Foley added a comment -

          Looks good. Cos, can you +1 it?

          Matt Foley added a comment -

           If we don't hear from Cos by end of day, I'll commit this, since we addressed his comment, the Jenkins test passed, and no one else has commented in four days. Also, this just mirrors the already-committed HADOOP-7322. Thanks.

          Matt Foley added a comment -

          Committed to trunk. Thanks, Bharath! And thanks Cos for helping review.

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #709 (See https://builds.apache.org/hudson/job/Hadoop-Common-trunk/709/)
          HADOOP-7342. Add an utility API in FileUtil for JDK File.list avoid NPEs on File.list(). Contributed by Bharath Mundlapudi.

          mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1131330
          Files :

          • /hadoop/common/trunk/CHANGES.txt
          • /hadoop/common/trunk/src/java/org/apache/hadoop/fs/FileUtil.java
          • /hadoop/common/trunk/src/test/core/org/apache/hadoop/fs/TestFileUtil.java
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #641 (See https://builds.apache.org/hudson/job/Hadoop-Common-trunk-Commit/641/)


            People

             • Assignee: Bharath Mundlapudi
             • Reporter: Bharath Mundlapudi
             • Votes: 0
             • Watchers: 2
