Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.21.0, 0.22.0
    • Fix Version/s: 0.21.0
    • Component/s: fs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Per Sanjay's suggestion in HADOOP-6421 let's deprecate FileStatus#isDir() and add isDirectory() and isFile() to compliment isSymlink. Currently clients assume !isDir() implies a file, which is no longer true with symlinks. I'll file a separate jira to change the various uses of !isDir() to be isFile() or isFile() or isSymlink() as appropriate.

      1. hadoop-6585-5.patch
        44 kB
        Eli Collins
      2. hadoop-6585-4.patch
        44 kB
        Eli Collins
      3. hadoop-6585-3.patch
        45 kB
        Eli Collins
      4. hadoop-6585-2.patch
        45 kB
        Eli Collins
      5. hadoop-6585-1.patch
        35 kB
        Eli Collins

        Issue Links

          Activity

          Hide
          Eli Collins added a comment -

          This jira should also cover converting any uses of !isDir() in common code to isFile().

          Show
          Eli Collins added a comment - This jira should also cover converting any uses of !isDir() in common code to isFile().
          Hide
          Eli Collins added a comment -

          Patch attached. The change is fairly straightforward since almost all uses of isDir are called on FileStatus object retrieve using getFileStatus. Since getFileStatus fully resolves links the resulting status is either a file or a directory so !isDir can simply be replaced with isFile. Since FileSystem does not support symlinks isDir is equivalent to isDirectory, and isFile is equivalent to !isDir so those changes are straightforward as well. This patch applies against trunk with the patches for HADOOP-6563, and HADOOP-6678 applied.

          Show
          Eli Collins added a comment - Patch attached. The change is fairly straightforward since almost all uses of isDir are called on FileStatus object retrieve using getFileStatus. Since getFileStatus fully resolves links the resulting status is either a file or a directory so !isDir can simply be replaced with isFile. Since FileSystem does not support symlinks isDir is equivalent to isDirectory, and isFile is equivalent to !isDir so those changes are straightforward as well. This patch applies against trunk with the patches for HADOOP-6563 , and HADOOP-6678 applied.
          Hide
          Eli Collins added a comment -

          Need this in 21 since it preserves the old behavior of FileStatus#isDir.

          Show
          Eli Collins added a comment - Need this in 21 since it preserves the old behavior of FileStatus#isDir.
          Hide
          Eli Collins added a comment -

          Patch attached merges with trunk. Also:

          • Added missing rename test FileContextSymlinkBaseTest#testRenameFileWithDestParentSymlink that covers renaming a file to a path dst where the parent of dst is a symlink (eg rename dir/file to link/file for all types of link).
          • Made FileContextSymlinkBaseTest check more specific instances of caught exceptions
          • Remove unused checks for null in AFS#renameInternal, AFS#getFile*Status should never return null

          Will run hudson now that all the dependencies for this change are checked in.

          Show
          Eli Collins added a comment - Patch attached merges with trunk. Also: Added missing rename test FileContextSymlinkBaseTest#testRenameFileWithDestParentSymlink that covers renaming a file to a path dst where the parent of dst is a symlink (eg rename dir/file to link/file for all types of link). Made FileContextSymlinkBaseTest check more specific instances of caught exceptions Remove unused checks for null in AFS#renameInternal, AFS#getFile*Status should never return null Will run hudson now that all the dependencies for this change are checked in.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12443363/hadoop-6585-2.patch
          against trunk revision 939917.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 23 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 1023 javac compiler warnings (more than the trunk's current 1017 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/493/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/493/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/493/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/493/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443363/hadoop-6585-2.patch against trunk revision 939917. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 23 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 1023 javac compiler warnings (more than the trunk's current 1017 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/493/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/493/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/493/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/493/console This message is automatically generated.
          Hide
          Eli Collins added a comment -

          The 6 new warnings are all calls to the now deprecated FileStatus#isDir that I added to FileContextSymlinkBaseTest.java. Confirmed this by diffing the output of ant -Djavac.args="-Xlint -Xmaxwarns 1000" clear test-core run with and w/o the patch.

          Show
          Eli Collins added a comment - The 6 new warnings are all calls to the now deprecated FileStatus#isDir that I added to FileContextSymlinkBaseTest.java. Confirmed this by diffing the output of ant -Djavac.args="-Xlint -Xmaxwarns 1000" clear test-core run with and w/o the patch.
          Hide
          Eli Collins added a comment -

          Attached patch merges with trunk.

          Show
          Eli Collins added a comment - Attached patch merges with trunk.
          Hide
          Eli Collins added a comment -

          The original symlinks change made FileStatus#isDir return false for symlinks because that seemed logical, but Sanjay pointed out in HADOOP-6421:

          I missed that this breaks compatibility in a subtle way: Negation of FileStatus#isDir() is normally used to see if the target is a file. This will now break if the target happens to be symlink.

          I assumed that in addition to deprecating FileStatus#isDir we need to preserve the old behavior that !isDir implies a file so that clients that use FileStatus and this common idiom that were written before symlinks won't break when symlinks are introduced, though these clients are going to break anyway (though perhaps less frequently) because they will see symlinks as directories since they were only written to check isDir. Therefore I propose that we do not modify the isDir behavior in this change (ie it returns trues iff the FileStatus represents a directory) and we just deprecate it. I'll update the patch.

          Show
          Eli Collins added a comment - The original symlinks change made FileStatus#isDir return false for symlinks because that seemed logical, but Sanjay pointed out in HADOOP-6421 : I missed that this breaks compatibility in a subtle way: Negation of FileStatus#isDir() is normally used to see if the target is a file. This will now break if the target happens to be symlink. I assumed that in addition to deprecating FileStatus#isDir we need to preserve the old behavior that !isDir implies a file so that clients that use FileStatus and this common idiom that were written before symlinks won't break when symlinks are introduced, though these clients are going to break anyway (though perhaps less frequently) because they will see symlinks as directories since they were only written to check isDir. Therefore I propose that we do not modify the isDir behavior in this change (ie it returns trues iff the FileStatus represents a directory) and we just deprecate it. I'll update the patch.
          Hide
          Hairong Kuang added a comment -

          Wow, Eli, you made a lot of tedious/subtle changes.. Good work!

          I agree that we should not change the definition of the deprecated method isDir. Here are a couple of initial comments. I might post more when I get more time to review carefully.
          FileSystem.java:
          1. should change the message of the exception ParentNotDirectoryException;
          2. should update the comment in getContentSummary: // f is a file --> // f is a file or symlink
          RawLocalFileSystem.java:
          should we change the check getFileStatus(f).isDir() to be !getFileStatus(f).isFile()?

          Show
          Hairong Kuang added a comment - Wow, Eli, you made a lot of tedious/subtle changes.. Good work! I agree that we should not change the definition of the deprecated method isDir. Here are a couple of initial comments. I might post more when I get more time to review carefully. FileSystem.java: 1. should change the message of the exception ParentNotDirectoryException; 2. should update the comment in getContentSummary: // f is a file --> // f is a file or symlink RawLocalFileSystem.java: should we change the check getFileStatus(f).isDir() to be !getFileStatus(f).isFile()?
          Hide
          Sanjay Radia added a comment -

          > Therefore I propose that we do not modify the isDir.
          That has also always been my position: not modify isDir() but simply deprecate and add a new isDirectory().
          When we added symlinks, isDir became ambiguous.
          Good news is that by time folks add actual symlinks to their namespaces, applications have had a chance to
          move to isDirectory().

          Show
          Sanjay Radia added a comment - > Therefore I propose that we do not modify the isDir. That has also always been my position: not modify isDir() but simply deprecate and add a new isDirectory(). When we added symlinks, isDir became ambiguous. Good news is that by time folks add actual symlinks to their namespaces, applications have had a chance to move to isDirectory().
          Hide
          Eli Collins added a comment -

          Thanks Hairong and Sanjay - glad we're all on the same page!

          Latest patch attached does not modify the behavior of FileStatus#isDir, just deprecates it, updates its comment and removes all the added isDir asserts in the tests that are now no longer necessary.

          I left the ParentNotDirectoryException messages in FileSystem as is since getFileStatus will always fully resolve the link, ie if the parent is not a directory it must be a file (as opposed to getFileLinkStatus which can return a FileStatus that represents a link). Lemme know if I missed something.

          Ditto the comment in getContentSummary is correct since f can't be a symlink, I modified that code to check isFile instead which is equivalent and I think more clear.

          Similarly in RawLocalFileSystem we want to check isDirectory in append since we are checking the FileStatus that the path fully resolves to, and we want to make sure that's not a directory. I should point out that in general the changes to FileSystem* are safe since isDir and isDirectory are now equivalent and FileSystem* always uses getFileStatus.

          Show
          Eli Collins added a comment - Thanks Hairong and Sanjay - glad we're all on the same page! Latest patch attached does not modify the behavior of FileStatus#isDir, just deprecates it, updates its comment and removes all the added isDir asserts in the tests that are now no longer necessary. I left the ParentNotDirectoryException messages in FileSystem as is since getFileStatus will always fully resolve the link, ie if the parent is not a directory it must be a file (as opposed to getFileLinkStatus which can return a FileStatus that represents a link). Lemme know if I missed something. Ditto the comment in getContentSummary is correct since f can't be a symlink, I modified that code to check isFile instead which is equivalent and I think more clear. Similarly in RawLocalFileSystem we want to check isDirectory in append since we are checking the FileStatus that the path fully resolves to, and we want to make sure that's not a directory. I should point out that in general the changes to FileSystem* are safe since isDir and isDirectory are now equivalent and FileSystem* always uses getFileStatus.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12443758/hadoop-6585-4.patch
          against trunk revision 940989.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 23 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/506/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/506/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/506/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/506/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443758/hadoop-6585-4.patch against trunk revision 940989. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 23 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/506/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/506/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/506/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/506/console This message is automatically generated.
          Hide
          Tom White added a comment -

          This looks ready to commit. +1

          Minor nit - add @deprecated to FileStatus#isDir doc comment so that it shows up more clearly in generated Javadoc.

          Show
          Tom White added a comment - This looks ready to commit. +1 Minor nit - add @deprecated to FileStatus#isDir doc comment so that it shows up more clearly in generated Javadoc.
          Hide
          Eli Collins added a comment -

          Thanks Tom. Patch attached, addresses the nit.

          Show
          Eli Collins added a comment - Thanks Tom. Patch attached, addresses the nit.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12445723/hadoop-6585-5.patch
          against trunk revision 949032.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 23 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/552/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/552/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/552/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/552/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12445723/hadoop-6585-5.patch against trunk revision 949032. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 23 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/552/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/552/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/552/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/552/console This message is automatically generated.
          Hide
          Tom White added a comment -

          I've just committed this. Thanks Eli!

          Show
          Tom White added a comment - I've just committed this. Thanks Eli!
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #272 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/272/)
          HADOOP-6585. Add FileStatus#isDirectory and isFile. Contributed by Eli Collins.

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #272 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/272/ ) HADOOP-6585 . Add FileStatus#isDirectory and isFile. Contributed by Eli Collins.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #352 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/352/)
          HADOOP-6585. Add FileStatus#isDirectory and isFile. Contributed by Eli Collins.

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk #352 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/352/ ) HADOOP-6585 . Add FileStatus#isDirectory and isFile. Contributed by Eli Collins.

            People

            • Assignee:
              Eli Collins
              Reporter:
              Eli Collins
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development