Hadoop Common
  1. Hadoop Common
  2. HADOOP-8633

Interrupted FsShell copies may leave tmp files

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 0.23.0, 2.0.0-alpha, 3.0.0
    • Fix Version/s: 0.23.3, 2.0.2-alpha
    • Component/s: fs
    • Labels:
      None

      Description

      Interrupting a copy, ex. via SIGINT, may cause tmp files to not be removed. If the user is copying large files then the remnants will eat into the user's quota.

        Issue Links

          Activity

          Hide
          Daryn Sharp added a comment -

          deleteOnExit issues contribute and/or are related to the problem

          Show
          Daryn Sharp added a comment - deleteOnExit issues contribute and/or are related to the problem
          Hide
          Daryn Sharp added a comment -

          Will submit after deps are checked in.

          Show
          Daryn Sharp added a comment - Will submit after deps are checked in.
          Hide
          Kihwal Lee added a comment -

          Looks good to me. I like the use of TargetFileSystem.

          Show
          Kihwal Lee added a comment - Looks good to me. I like the use of TargetFileSystem .
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12538590/HADOOP-8633.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1240//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1240//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12538590/HADOOP-8633.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1240//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1240//console This message is automatically generated.
          Hide
          Robert Joseph Evans added a comment -

          The changes look OK. I'm not sure I really like TargetFileSystem. I just don't see a lot of benefit from making TargetFileSystem. I think it would be cleaner to just have your own list of Paths to delete and then delete them in the finally block. Having it be a FileSystem just seems confusing to me.

          Because TargetFileSystem is never added to the FileSystem cache I assume that if a System.exit is called somewhere while processing the command the TargetFileSystem will not be closed and the files will not actually be deleted on exit. Is this important?

          Show
          Robert Joseph Evans added a comment - The changes look OK. I'm not sure I really like TargetFileSystem. I just don't see a lot of benefit from making TargetFileSystem. I think it would be cleaner to just have your own list of Paths to delete and then delete them in the finally block. Having it be a FileSystem just seems confusing to me. Because TargetFileSystem is never added to the FileSystem cache I assume that if a System.exit is called somewhere while processing the command the TargetFileSystem will not be closed and the files will not actually be deleted on exit. Is this important?
          Hide
          Daryn Sharp added a comment -

          The TargetFileSystem is just a shim over a real filesystem that is registering and canceling temp paths for deletion. The shim simplifies all the code performing the copy and helps ensure the temp files are cancelled and/or deleted immediately. I originally did what you suggest and I wound up with multiple nested try blocks and conditions that made the code (imho) harder to read, understand, and difficult to test.

          It's true that a call to System.exit won't cleanup the filesystem, but it's only called as the last line in main. Calling it in other places would break functionality and be a bug.

          (I did manually run a copy with 100 iterations and pounded on control-c and no remnants where left)

          Show
          Daryn Sharp added a comment - The TargetFileSystem is just a shim over a real filesystem that is registering and canceling temp paths for deletion. The shim simplifies all the code performing the copy and helps ensure the temp files are cancelled and/or deleted immediately. I originally did what you suggest and I wound up with multiple nested try blocks and conditions that made the code (imho) harder to read, understand, and difficult to test. It's true that a call to System.exit won't cleanup the filesystem, but it's only called as the last line in main . Calling it in other places would break functionality and be a bug. (I did manually run a copy with 100 iterations and pounded on control-c and no remnants where left)
          Hide
          Robert Joseph Evans added a comment -

          Ok +1

          Show
          Robert Joseph Evans added a comment - Ok +1
          Hide
          Thomas Graves added a comment -

          I went ahead and committed this. Thanks Daryn, Bobby, and Kihwal!

          Show
          Thomas Graves added a comment - I went ahead and committed this. Thanks Daryn, Bobby, and Kihwal!
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #2612 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2612/)
          HADOOP-8633. Interrupted FsShell copies may leave tmp files (Daryn Sharp via tgraves) (Revision 1368002)

          Result = SUCCESS
          tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368002
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopy.java
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #2612 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2612/ ) HADOOP-8633 . Interrupted FsShell copies may leave tmp files (Daryn Sharp via tgraves) (Revision 1368002) Result = SUCCESS tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368002 Files : /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopy.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #2547 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2547/)
          HADOOP-8633. Interrupted FsShell copies may leave tmp files (Daryn Sharp via tgraves) (Revision 1368002)

          Result = SUCCESS
          tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368002
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopy.java
          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #2547 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2547/ ) HADOOP-8633 . Interrupted FsShell copies may leave tmp files (Daryn Sharp via tgraves) (Revision 1368002) Result = SUCCESS tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368002 Files : /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopy.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #2565 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2565/)
          HADOOP-8633. Interrupted FsShell copies may leave tmp files (Daryn Sharp via tgraves) (Revision 1368002)

          Result = FAILURE
          tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368002
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopy.java
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #2565 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2565/ ) HADOOP-8633 . Interrupted FsShell copies may leave tmp files (Daryn Sharp via tgraves) (Revision 1368002) Result = FAILURE tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368002 Files : /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopy.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #332 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/332/)
          merge -r 1368002:1368003 from branch-2. FIXES: HADOOP-8633 (Revision 1368004)

          Result = SUCCESS
          tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368004
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopy.java
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-0.23-Build #332 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/332/ ) merge -r 1368002:1368003 from branch-2. FIXES: HADOOP-8633 (Revision 1368004) Result = SUCCESS tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368004 Files : /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopy.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1123 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1123/)
          HADOOP-8633. Interrupted FsShell copies may leave tmp files (Daryn Sharp via tgraves) (Revision 1368002)

          Result = SUCCESS
          tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368002
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopy.java
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #1123 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1123/ ) HADOOP-8633 . Interrupted FsShell copies may leave tmp files (Daryn Sharp via tgraves) (Revision 1368002) Result = SUCCESS tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368002 Files : /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopy.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1155 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1155/)
          HADOOP-8633. Interrupted FsShell copies may leave tmp files (Daryn Sharp via tgraves) (Revision 1368002)

          Result = FAILURE
          tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368002
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopy.java
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #1155 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1155/ ) HADOOP-8633 . Interrupted FsShell copies may leave tmp files (Daryn Sharp via tgraves) (Revision 1368002) Result = FAILURE tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368002 Files : /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopy.java

            People

            • Assignee:
              Daryn Sharp
              Reporter:
              Daryn Sharp
            • Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development