Hadoop Common
  1. Hadoop Common
  2. HADOOP-5353

add progress callback feature to the slow FileUtil operations with ability to cancel the work

    Details

    • Type: Improvement Improvement
    • Status: Patch Available
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.21.0
    • Fix Version/s: None
    • Component/s: fs
    • Labels:
      None

      Description

      This is something only of relevance of people doing front ends to FS operations, and as they could take the code in FSUtil and add something with this feature, its a blocker to none of them.

      Current FileUtil.copy can take a long time to move large files around, but there is no progress indicator to GUIs, or a way to cancel the operation mid-way, j interrupting the thread or closing the filesystem.

      I propose a FileIOProgress interface to the copy ops, one that had a single method to notify listeners of bytes read and written, and the number of files handled.

      
      

      interface FileIOProgress

      { boolean progress(int files, long bytesRead, long bytesWritten); }

      The return value would be true to continue the operation, or false to stop the copy and leave the FS in whatever incomplete state it is in currently.

      it could even be fancier: have beginFileOperation and endFileOperation callbacks to pass in the name of the current file being worked on, though I don't have a personal need for that.

      GUIs could show progress bars and cancel buttons, other tools could use the interface to pass any cancellation notice upstream.

      The FileUtil.copy operations would call this interface (blocking) after every block copy, so the frequency of invocation would depend on block size and network/disk speeds. Which is also why I don't propose having any percentage done indicators; it's too hard to predict percentage of time done for distributed file IO with any degree of accuracy.

      1. HADOOP-5353.000.patch
        5 kB
        Lei (Eddy) Xu
      2. HADOOP-5353.001.patch
        33 kB
        Pranav Prakash
      3. HADOOP-5353.002.patch
        38 kB
        Pranav Prakash

        Activity

        Hide
        Lei (Eddy) Xu added a comment -

        Upload a Progress interface for discussion. I would like to gather some feedbacks before integrating this patch to the current fs.shell.

        I imagine the refactored shell copy command should look like below:

        CopyCommands.java
        class Put extends ... {
        
        private Progress progress = new FileIOProgress();
        
        protected void processArguments(args ...) {
           ...
          try {
             processArguments(args);
          } catch (IOException e) {
             ...
          } catch (interruptedexception e) {
             progress.stop();
          } catch (StopProgress e) {
             LOG.info("Copy operation is stopped.");
          }
            ...
        } 
        

        Steve Loughran, would you take a look at this patch and give some advices?

        Thank you!

        Show
        Lei (Eddy) Xu added a comment - Upload a Progress interface for discussion. I would like to gather some feedbacks before integrating this patch to the current fs.shell . I imagine the refactored shell copy command should look like below: CopyCommands.java class Put extends ... { private Progress progress = new FileIOProgress(); protected void processArguments(args ...) { ... try { processArguments(args); } catch (IOException e) { ... } catch (interruptedexception e) { progress.stop(); } catch (StopProgress e) { LOG.info( "Copy operation is stopped." ); } ... } Steve Loughran , would you take a look at this patch and give some advices? Thank you!
        Hide
        Steve Loughran added a comment -

        well, that was a long time ago, wasn't it?

        Having a quick look at the patch the core design looks good

        • Progress should be IOProgress in case we add more types in future
        • StopProgress should be StopProgressException, and take a text string

        I see in this design a remote listener is expected to poll for status. I'd envisaged some kind of callback instead, but actually a progress counter would be more loosely coupled. Polling can be inefficient, so we should recommend to tools that they do a sleep+poll

        One of the fun things would actually be implementing a cancel operation; most bulk single-thread IO operations don't; the HTTP based object stores do all their work in close() so stopping the write isn't enough...

        Show
        Steve Loughran added a comment - well, that was a long time ago, wasn't it? Having a quick look at the patch the core design looks good Progress should be IOProgress in case we add more types in future StopProgress should be StopProgressException , and take a text string I see in this design a remote listener is expected to poll for status. I'd envisaged some kind of callback instead, but actually a progress counter would be more loosely coupled. Polling can be inefficient, so we should recommend to tools that they do a sleep+poll One of the fun things would actually be implementing a cancel operation; most bulk single-thread IO operations don't; the HTTP based object stores do all their work in close() so stopping the write isn't enough...
        Hide
        Lei (Eddy) Xu added a comment -

        Thank you Steve Loughran! Your comments are a great start point for me to adjust this patch!

        I will try to implement a callback to IOProgress now and see what I can get from there..

        Show
        Lei (Eddy) Xu added a comment - Thank you Steve Loughran ! Your comments are a great start point for me to adjust this patch! I will try to implement a callback to IOProgress now and see what I can get from there..
        Hide
        Pranav Prakash added a comment -

        Attached is a preliminary patch that allows one to pass an `IOProgress` object to the copy methods and poll the transfer status

        Show
        Pranav Prakash added a comment - Attached is a preliminary patch that allows one to pass an `IOProgress` object to the copy methods and poll the transfer status
        Hide
        Steve Loughran added a comment -

        Quick review

        • please don't change indentation needlessly, do follow hadoop code layout rules, import wildcard policy etc.
        • FileUtil.copy() should throw FileNotFoundException when things aren't there.
        • line 482 Have a look inside FileSystem.exists() and you'll see why you could cut that clause and just go with the else clause
        • 499. Exception.toString() over Exception.getMessage
        • line 529. I think there should be better handling than "false" if the destination dir couldn't be created, given its
          the complete failure of the operation.
        • like 546 & elsewhere. Use try-with-resources now the codebase is Java 7+

        CommandWithDestination: TargetFileSystem supports try-with-resources cleanup too. Same for writeStreamToFile

        regarding the test,

        1. look at [ https://github.com/steveloughran/formality/blob/master/styleguide/styleguide.md ] for how to write good asserts.
        2. In particular, assertTrue and assertFalse must have messages, and if some probe of exception value/state fails, just throw that exception again.
        3. There's lots of code in ContractTestUtils for writing/reading/probing filesystems
        Show
        Steve Loughran added a comment - Quick review please don't change indentation needlessly, do follow hadoop code layout rules, import wildcard policy etc. FileUtil.copy() should throw FileNotFoundException when things aren't there. line 482 Have a look inside FileSystem.exists() and you'll see why you could cut that clause and just go with the else clause 499. Exception.toString() over Exception.getMessage line 529. I think there should be better handling than "false" if the destination dir couldn't be created, given its the complete failure of the operation. like 546 & elsewhere. Use try-with-resources now the codebase is Java 7+ CommandWithDestination : TargetFileSystem supports try-with-resources cleanup too. Same for writeStreamToFile regarding the test, look at [ https://github.com/steveloughran/formality/blob/master/styleguide/styleguide.md ] for how to write good asserts. In particular, assertTrue and assertFalse must have messages, and if some probe of exception value/state fails, just throw that exception again. There's lots of code in ContractTestUtils for writing/reading/probing filesystems
        Hide
        Pranav Prakash added a comment -

        Thank you for the code review Steve!

        I've uploaded a revised version of the patch based on your feedback and cleaned up the duplicate code between the copy methods that take a progress handle and the ones that don’t.

        Show
        Pranav Prakash added a comment - Thank you for the code review Steve! I've uploaded a revised version of the patch based on your feedback and cleaned up the duplicate code between the copy methods that take a progress handle and the ones that don’t.
        Hide
        Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 24s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
        +1 mvninstall 8m 3s trunk passed
        +1 compile 6m 56s trunk passed
        +1 checkstyle 0m 27s trunk passed
        +1 mvnsite 0m 58s trunk passed
        +1 mvneclipse 0m 14s trunk passed
        +1 findbugs 1m 20s trunk passed
        +1 javadoc 0m 50s trunk passed
        +1 mvninstall 0m 37s the patch passed
        +1 compile 6m 52s the patch passed
        +1 javac 6m 52s the patch passed
        -1 checkstyle 0m 25s hadoop-common-project/hadoop-common: The patch generated 62 new + 134 unchanged - 6 fixed = 196 total (was 140)
        +1 mvnsite 0m 55s the patch passed
        +1 mvneclipse 0m 13s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        -1 findbugs 1m 36s hadoop-common-project/hadoop-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
        +1 javadoc 0m 48s the patch passed
        -1 unit 19m 35s hadoop-common in the patch failed.
        +1 asflicense 0m 19s The patch does not generate ASF License warnings.
        51m 24s



        Reason Tests
        FindBugs module:hadoop-common-project/hadoop-common
          Private method org.apache.hadoop.fs.FileUtil.copy(FileSystem, FileStatus, File, boolean, Configuration) is never called At FileUtil.java:boolean, Configuration) is never called At FileUtil.java:[line 354]
        Timed out junit tests org.apache.hadoop.http.TestHttpServerLifecycle



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:e2f6409
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12810910/HADOOP-5353.002.patch
        JIRA Issue HADOOP-5353
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 42497f147a95 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 5370a6f
        Default Java 1.8.0_91
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-HADOOP-Build/9836/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt
        findbugs https://builds.apache.org/job/PreCommit-HADOOP-Build/9836/artifact/patchprocess/new-findbugs-hadoop-common-project_hadoop-common.html
        unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9836/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
        Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/9836/testReport/
        modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
        Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/9836/console
        Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        Show
        Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 24s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 2 new or modified test files. +1 mvninstall 8m 3s trunk passed +1 compile 6m 56s trunk passed +1 checkstyle 0m 27s trunk passed +1 mvnsite 0m 58s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 1m 20s trunk passed +1 javadoc 0m 50s trunk passed +1 mvninstall 0m 37s the patch passed +1 compile 6m 52s the patch passed +1 javac 6m 52s the patch passed -1 checkstyle 0m 25s hadoop-common-project/hadoop-common: The patch generated 62 new + 134 unchanged - 6 fixed = 196 total (was 140) +1 mvnsite 0m 55s the patch passed +1 mvneclipse 0m 13s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. -1 findbugs 1m 36s hadoop-common-project/hadoop-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) +1 javadoc 0m 48s the patch passed -1 unit 19m 35s hadoop-common in the patch failed. +1 asflicense 0m 19s The patch does not generate ASF License warnings. 51m 24s Reason Tests FindBugs module:hadoop-common-project/hadoop-common   Private method org.apache.hadoop.fs.FileUtil.copy(FileSystem, FileStatus, File, boolean, Configuration) is never called At FileUtil.java:boolean, Configuration) is never called At FileUtil.java: [line 354] Timed out junit tests org.apache.hadoop.http.TestHttpServerLifecycle Subsystem Report/Notes Docker Image:yetus/hadoop:e2f6409 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12810910/HADOOP-5353.002.patch JIRA Issue HADOOP-5353 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 42497f147a95 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 5370a6f Default Java 1.8.0_91 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-HADOOP-Build/9836/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt findbugs https://builds.apache.org/job/PreCommit-HADOOP-Build/9836/artifact/patchprocess/new-findbugs-hadoop-common-project_hadoop-common.html unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9836/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/9836/testReport/ modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/9836/console Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.

          People

          • Assignee:
            Pranav Prakash
            Reporter:
            Steve Loughran
          • Votes:
            0 Vote for this issue
            Watchers:
            11 Start watching this issue

            Dates

            • Created:
              Updated:

              Development