Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-6262

HDFS doesn't raise FileNotFoundException if the source of a rename() is missing

    Details

    • Type: Bug Bug
    • Status: Patch Available
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: namenode
    • Labels:
    • Hadoop Flags:
      Incompatible change
    • Release Note:
      HDFS's rename(src, dst) throws FileNotFoundException instead of logging warn message and returning false if the source of the rename does not exist.

      Description

      HDFS's rename(src, dest) returns false if src does not exist -all the other filesystems raise FileNotFoundException

      This behaviour is defined in FSDirectory.unprotectedRenameTo() -the attempt is logged, but the operation then just returns false.

      I propose changing the behaviour of DistributedFileSystem to be the same as that of the others -and of FileContext, which does reject renames with nonexistent sources

      1. HDFS-6262.2.patch
        9 kB
        Akira AJISAKA
      2. HDFS-6262.patch
        4 kB
        Akira AJISAKA

        Issue Links

          Activity

          Hide
          Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 patch 0m 0s The patch command could not apply the patch during dryrun.



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12641192/HDFS-6262.2.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / f1a152c
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/10654/console

          This message was automatically generated.

          Show
          Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment -1 patch 0m 0s The patch command could not apply the patch during dryrun. Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12641192/HDFS-6262.2.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / f1a152c Console output https://builds.apache.org/job/PreCommit-HDFS-Build/10654/console This message was automatically generated.
          Hide
          Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 patch 0m 0s The patch command could not apply the patch during dryrun.



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12641192/HDFS-6262.2.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / f1a152c
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/10632/console

          This message was automatically generated.

          Show
          Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment -1 patch 0m 0s The patch command could not apply the patch during dryrun. Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12641192/HDFS-6262.2.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / f1a152c Console output https://builds.apache.org/job/PreCommit-HDFS-Build/10632/console This message was automatically generated.
          Hide
          Suresh Srinivas added a comment -

          But: every other filesystem considers renaming a file that doesn't exist to be an error.

          I agree. However, some of our methods have two ways to indicate failure - return false or thrown an exception and there lies the problem.

          Given that applications must handle exception being thrown from these methods, changing the behavior for HDFS should be okay. But we do not how all the apps use this API and I suspect we will break some applications. Especially case 2 you pointed in your comments. One thing I was thinking of was to possibly have a hidden configuration to revert back to old behavior in HDFS. But that is pretty ugly.

          Again, I feel we should leave the method as is in HDFS. But I am okay if you want to go ahead and make this change. We should perhaps document it and hope that not too many applications break.

          Show
          Suresh Srinivas added a comment - But: every other filesystem considers renaming a file that doesn't exist to be an error. I agree. However, some of our methods have two ways to indicate failure - return false or thrown an exception and there lies the problem. Given that applications must handle exception being thrown from these methods, changing the behavior for HDFS should be okay. But we do not how all the apps use this API and I suspect we will break some applications. Especially case 2 you pointed in your comments. One thing I was thinking of was to possibly have a hidden configuration to revert back to old behavior in HDFS. But that is pretty ugly. Again, I feel we should leave the method as is in HDFS. But I am okay if you want to go ahead and make this change. We should perhaps document it and hope that not too many applications break.
          Hide
          Steve Loughran added a comment -

          Suresh, thanks for the link to HADOOP-6240 -I hadn't seen that. But: every other filesystem considers renaming a file that doesn't exist to be an error.

          Do we have any examples where failing to fault on renaming a nonexistent file is NOT an error to flag up?

          Looking at the hadoop production source

          • org.apache.hadoop.fs.shell.MoveCommands says "we have no way to know the actual error..." and throws a PathIOException
          • org.apache.hadoop.fs.shell.CommandWithDestination says "too bad we don't know why it failed" and does the same
          • org.apache.hadoop.io.MapFile raises an IOException
          • org.apache.hadoop.tools.mapred.CopyCommitter raises an IOE, as does org.apache.hadoop.tools.mapred.RetriableFileCopyCommand

          Similar behaviour for:

          LocalContainerLauncher, DistCpV1
          org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
          org.apache.hadoop.mapreduce.v2.hs.HistoryServerFileSystemStateStoreService, 
          ...
          

          and those that blindly assume that rename's return value doesn't need checking

          JobHistoryEventHandler
          TaskLog (on localFS though)
          org.apache.hadoop.mapreduce.task.reduce.OnDiskMapOutput
          org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster
          org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl
          org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
          
          

          In fact. the only bit of code I can see that converts the false return code to a warning is org.apache.hadoop.tools.mapred.lib.DynamicInputChunk

          To summarise, in the Hadoop production code, in all but one case the handling of a false return code takes two forms

          1. triggers the throwing of a "that failed but we don't know why" IOException
          2. is blissfully ignorant that the operation has failed, and has so far been lucky in avoiding concurrency problems with their source being renamed while they weren't looking.

          All of these uses benefit from having rename consistently throw a FileNotFoundException if the source file isn't there

          Show
          Steve Loughran added a comment - Suresh, thanks for the link to HADOOP-6240 -I hadn't seen that. But: every other filesystem considers renaming a file that doesn't exist to be an error. Do we have any examples where failing to fault on renaming a nonexistent file is NOT an error to flag up? Looking at the hadoop production source org.apache.hadoop.fs.shell.MoveCommands says "we have no way to know the actual error..." and throws a PathIOException org.apache.hadoop.fs.shell.CommandWithDestination says "too bad we don't know why it failed" and does the same org.apache.hadoop.io.MapFile raises an IOException org.apache.hadoop.tools.mapred.CopyCommitter raises an IOE, as does org.apache.hadoop.tools.mapred.RetriableFileCopyCommand Similar behaviour for: LocalContainerLauncher, DistCpV1 org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter org.apache.hadoop.mapreduce.v2.hs.HistoryServerFileSystemStateStoreService, ... and those that blindly assume that rename's return value doesn't need checking JobHistoryEventHandler TaskLog (on localFS though) org.apache.hadoop.mapreduce.task.reduce.OnDiskMapOutput org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore In fact. the only bit of code I can see that converts the false return code to a warning is org.apache.hadoop.tools.mapred.lib.DynamicInputChunk To summarise, in the Hadoop production code, in all but one case the handling of a false return code takes two forms triggers the throwing of a "that failed but we don't know why" IOException is blissfully ignorant that the operation has failed, and has so far been lucky in avoiding concurrency problems with their source being renamed while they weren't looking. All of these uses benefit from having rename consistently throw a FileNotFoundException if the source file isn't there
          Hide
          Suresh Srinivas added a comment -

          Steve Loughran, given rename had different behaviors on different file systems, we did not want to change it and leave it alone (as discussed in HADOOP-6240). That way applications that are dealing with or dependent different behaviors of the file systems continue to work. I think we should deprecate old rename and and not test it for consistent behavior. Lets only test the new rename for consistency across FileSystems.

          Show
          Suresh Srinivas added a comment - Steve Loughran , given rename had different behaviors on different file systems, we did not want to change it and leave it alone (as discussed in HADOOP-6240 ). That way applications that are dealing with or dependent different behaviors of the file systems continue to work. I think we should deprecate old rename and and not test it for consistent behavior. Lets only test the new rename for consistency across FileSystems.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Suresh Srinivas, since you have worked on HADOOP-6240 and HDFS-654, could you take a look?

          Show
          Tsz Wo Nicholas Sze added a comment - Suresh Srinivas , since you have worked on HADOOP-6240 and HDFS-654 , could you take a look?
          Hide
          Steve Loughran added a comment -

          it looks OK to me, but one of the HDFS experts has to vote on the details.

          What happens if you apply the HADOOP-9361 patch, with its HDFS test failing, and then apply this? As those are my rename behaviour tests?

          Show
          Steve Loughran added a comment - it looks OK to me, but one of the HDFS experts has to vote on the details. What happens if you apply the HADOOP-9361 patch, with its HDFS test failing, and then apply this? As those are my rename behaviour tests?
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12641192/HDFS-6262.2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6690//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6690//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12641192/HDFS-6262.2.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 5 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6690//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6690//console This message is automatically generated.
          Hide
          Akira AJISAKA added a comment -

          I suggest rename(src, dst) to call FSDirectory.renameTo(src, dst, Rename.NONE).

          I tried, but got a lot of errors.

          Tests in error:
           TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameFileMoveToNonExistentDirectory:358->FileSystemContractBaseTest.rename:489 » FileNotFound
           TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameFileAsExistingFile:378->FileSystemContractBaseTest.rename:489 » FileAlreadyExists
           TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameFileAsExistingDirectory:388->FileSystemContractBaseTest.rename:489 » Remote
           TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameDirectoryMoveToNonExistentDirectory:399->FileSystemContractBaseTest.rename:489 » FileNotFound
           TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameDirectoryAsExistingFile:431->FileSystemContractBaseTest.rename:489 » Remote
           TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameDirectoryAsExistingDirectory:444->FileSystemContractBaseTest.rename:489 » FileAlreadyExists
           TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameRootDirForbidden:598->FileSystemContractBaseTest.rename:489 » Remote
           TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameChildDirForbidden:617->FileSystemContractBaseTest.rename:489 » Remote
           TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameDirToSelf:649->FileSystemContractBaseTest.rename:489 » FileAlreadyExists
           TestHDFSFileSystemContract>FileSystemContractBaseTest.testMoveDirUnderParent:667 » FileAlreadyExists
           TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameFileToSelf:681->FileSystemContractBaseTest.rename:489 » FileAlreadyExists
           TestHDFSFileSystemContract>FileSystemContractBaseTest.testMoveFileUnderParent:695->FileSystemContractBaseTest.rename:489 » FileAlreadyExists
          

          The suggestion will change some other behaviors. I think it's better to make the behavior changes smaller for this JIRA, and now I suggest the v2 patch.
          Steve Loughran and Tsz Wo Nicholas Sze, what do you think?

          Show
          Akira AJISAKA added a comment - I suggest rename(src, dst) to call FSDirectory.renameTo(src, dst, Rename.NONE). I tried, but got a lot of errors. Tests in error: TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameFileMoveToNonExistentDirectory:358->FileSystemContractBaseTest.rename:489 » FileNotFound TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameFileAsExistingFile:378->FileSystemContractBaseTest.rename:489 » FileAlreadyExists TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameFileAsExistingDirectory:388->FileSystemContractBaseTest.rename:489 » Remote TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameDirectoryMoveToNonExistentDirectory:399->FileSystemContractBaseTest.rename:489 » FileNotFound TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameDirectoryAsExistingFile:431->FileSystemContractBaseTest.rename:489 » Remote TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameDirectoryAsExistingDirectory:444->FileSystemContractBaseTest.rename:489 » FileAlreadyExists TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameRootDirForbidden:598->FileSystemContractBaseTest.rename:489 » Remote TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameChildDirForbidden:617->FileSystemContractBaseTest.rename:489 » Remote TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameDirToSelf:649->FileSystemContractBaseTest.rename:489 » FileAlreadyExists TestHDFSFileSystemContract>FileSystemContractBaseTest.testMoveDirUnderParent:667 » FileAlreadyExists TestHDFSFileSystemContract>FileSystemContractBaseTest.testRenameFileToSelf:681->FileSystemContractBaseTest.rename:489 » FileAlreadyExists TestHDFSFileSystemContract>FileSystemContractBaseTest.testMoveFileUnderParent:695->FileSystemContractBaseTest.rename:489 » FileAlreadyExists The suggestion will change some other behaviors. I think it's better to make the behavior changes smaller for this JIRA, and now I suggest the v2 patch. Steve Loughran and Tsz Wo Nicholas Sze , what do you think?
          Hide
          Akira AJISAKA added a comment -

          I'm thinking the problem is rename(src, dst) calls deprecated FSDirectory.renameTo(src, dst), which doesn't throw FileNotFoundException even if src does not exist.
          I suggest rename(src, dst) to call FSDirectory.renameTo(src, dst, Rename.NONE).

          Show
          Akira AJISAKA added a comment - I'm thinking the problem is rename(src, dst) calls deprecated FSDirectory.renameTo(src, dst), which doesn't throw FileNotFoundException even if src does not exist. I suggest rename(src, dst) to call FSDirectory.renameTo(src, dst, Rename.NONE).
          Hide
          Steve Loughran added a comment -

          Nicholas -problem exists in trunk; the HADOOP-9361 tests show this, as does the patch. Currently the namenode logs at warn and returns 0 if the source is missing, everything else throws FileNotFoundException

          Show
          Steve Loughran added a comment - Nicholas -problem exists in trunk; the HADOOP-9361 tests show this, as does the patch. Currently the namenode logs at warn and returns 0 if the source is missing, everything else throws FileNotFoundException
          Hide
          Akira AJISAKA added a comment -

          Tsz Wo Nicholas Sze, I agree with you. I'll update the patch to remove the deprecated methods.

          Show
          Akira AJISAKA added a comment - Tsz Wo Nicholas Sze , I agree with you. I'll update the patch to remove the deprecated methods.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          The problematic rename methods were deprecated by HDFS-654 for a long time. Should we simply remove them?

          Show
          Tsz Wo Nicholas Sze added a comment - The problematic rename methods were deprecated by HDFS-654 for a long time. Should we simply remove them?
          Hide
          Tsz Wo Nicholas Sze added a comment -

          HADOOP-6240 is a closely related JIRA. Isn't the already fixed?

          Show
          Tsz Wo Nicholas Sze added a comment - HADOOP-6240 is a closely related JIRA. Isn't the already fixed?
          Hide
          Akira AJISAKA added a comment -

          Thanks Steve Loughran for the review. I update the patch to

          1. fix the test failures
          2. drop the log.warn message if the source of the rename does not exist

          we just need to get consent from the HDFS team

          I'm +1 (non-binding) to slightly change the behavior.

          Show
          Akira AJISAKA added a comment - Thanks Steve Loughran for the review. I update the patch to fix the test failures drop the log.warn message if the source of the rename does not exist we just need to get consent from the HDFS team I'm +1 (non-binding) to slightly change the behavior.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12641075/HDFS-6262.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.hdfs.TestHDFSFileSystemContract
          org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache
          org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
          org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6686//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6686//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12641075/HDFS-6262.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestHDFSFileSystemContract org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6686//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6686//console This message is automatically generated.
          Hide
          Steve Loughran added a comment -

          looks a good initial patch; we just need to get consent from the HDFS team that

          1. the faulting rename action is the one we want
          2. its OK to drop the log.warn message. Given it's a client-side failure, dropping the warning seems the right choice.

          We are changing HDFS semantics slightly -we'll need to mention that iin the release nodes

          Show
          Steve Loughran added a comment - looks a good initial patch; we just need to get consent from the HDFS team that the faulting rename action is the one we want its OK to drop the log.warn message. Given it's a client-side failure, dropping the warning seems the right choice. We are changing HDFS semantics slightly -we'll need to mention that iin the release nodes
          Hide
          Akira AJISAKA added a comment -

          Attaching an initial patch.

          Show
          Akira AJISAKA added a comment - Attaching an initial patch.

            People

            • Assignee:
              Akira AJISAKA
              Reporter:
              Steve Loughran
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:

                Development