Hadoop HDFS / HDFS-11786

Add support to make copyFromLocal multi threaded

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0-beta1
    • Component/s: hdfs
    • Labels:
      None

      Description

      CopyFromLocal/Put is not currently multithreaded.

      When multiple files need to be uploaded to HDFS, a single thread reads each file and copies its data to the cluster, one file at a time.

      The copy to HDFS can be made faster by uploading multiple files in parallel.

      I am attaching the initial patch so that I can get some initial feedback.
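
The idea of uploading several files in parallel can be sketched as follows. This is only an illustration, not the code in the attached patches: it copies between local directories with java.nio as a stand-in for the HDFS write, and the class and method names (ParallelCopySketch, copyAll, demo) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCopySketch {

    /** Copies every source file into destDir using a fixed pool of threads; returns the number copied. */
    static int copyAll(List<Path> sources, Path destDir, int threadCount) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, threadCount));
        List<Future<Path>> pending = new ArrayList<>();
        try {
            for (Path src : sources) {
                // One task per file. Assumption: a local copy stands in for the HDFS upload.
                Callable<Path> task = () -> Files.copy(
                        src, destDir.resolve(src.getFileName()), StandardCopyOption.REPLACE_EXISTING);
                pending.add(pool.submit(task));
            }
            int copied = 0;
            for (Future<Path> result : pending) {
                result.get(); // waits for the task and surfaces any copy failure
                copied++;
            }
            return copied;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while copying", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("copy failed", e.getCause());
        } finally {
            pool.shutdownNow(); // cancels any tasks still queued after a failure
        }
    }

    /** Creates three small temp files and copies them with the given thread count. */
    static int demo(int threadCount) {
        try {
            Path srcDir = Files.createTempDirectory("copy-src");
            Path dstDir = Files.createTempDirectory("copy-dst");
            List<Path> sources = new ArrayList<>();
            for (int i = 0; i < 3; i++) {
                byte[] data = ("data " + i).getBytes(StandardCharsets.UTF_8);
                sources.add(Files.write(srcDir.resolve("file" + i + ".txt"), data));
            }
            return copyAll(sources, dstDir, threadCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("copied " + demo(4) + " files");
    }
}
```

With a thread count of 1 the sketch degenerates to the sequential behaviour described above, since the pool then processes one file at a time.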

      1. HDFS-11786.001.patch
        5 kB
        Mukul Kumar Singh
      2. HDFS-11786.002.patch
        14 kB
        Mukul Kumar Singh
      3. HDFS-11786.003.patch
        14 kB
        Mukul Kumar Singh
      4. HDFS-11786.004.patch
        17 kB
        Mukul Kumar Singh
      5. HDFS-11786.005.patch
        17 kB
        Mukul Kumar Singh

          Activity

          boky01 Andras Bokor added a comment -

          Thanks Anu Engineer for your answer. I uploaded a fix for HADOOP-14698 to make the two commands identical again.
          Could you guys please check?

          anu Anu Engineer added a comment -

          Andras Bokor This feature request came from a customer requirement, so we fixed it for that specific case. We can certainly extend it.

          boky01 Andras Bokor added a comment -

          Anu Engineer, Mukul Kumar Singh,

          Is there any reason not to apply this threading feature to -put as well?
          I think making the two commands non-identical makes usage more complicated.
          Please check HADOOP-14698 and share your thoughts.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12017 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12017/)
          HDFS-11786. Add support to make copyFromLocal multi threaded. (aengineer: rev 02b141ac6059323ec43e472ca36dc570fdca386f)

          • (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopyPreserveFlag.java
          • (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommands.java
          • (add) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopyFromLocal.java
          • (edit) hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
          • (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/MoveCommands.java
          anu Anu Engineer added a comment -

          Mukul Kumar Singh Thank you for the contribution. I have committed this to the trunk.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote  Subsystem   Runtime  Comment
          0     reexec      0m 12s   Docker mode activated.
          +1    @author     0m 0s    The patch does not contain any @author tags.
          +1    test4tests  0m 0s    The patch appears to include 3 new or modified test files.
          +1    mvninstall  14m 48s  trunk passed
          +1    compile     17m 21s  trunk passed
          +1    checkstyle  0m 40s   trunk passed
          +1    mvnsite     1m 18s   trunk passed
          +1    findbugs    1m 44s   trunk passed
          +1    javadoc     0m 58s   trunk passed
          +1    mvninstall  0m 51s   the patch passed
          +1    compile     13m 21s  the patch passed
          +1    javac       13m 21s  the patch passed
          +1    checkstyle  0m 42s   the patch passed
          +1    mvnsite     1m 13s   the patch passed
          +1    whitespace  0m 0s    The patch has no whitespace issues.
          +1    xml         0m 2s    The patch has no ill-formed XML file.
          +1    findbugs    1m 55s   the patch passed
          +1    javadoc     1m 0s    the patch passed
          -1    unit        8m 46s   hadoop-common in the patch failed.
          +1    asflicense  0m 37s   The patch does not generate ASF License warnings.
                            66m 41s



          Reason Tests
          Failed junit tests hadoop.ha.TestZKFailoverController
            hadoop.security.TestRaceWhenRelogin



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-11786
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12875400/HDFS-11786.005.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml
          uname Linux b8d5026c2d11 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / fa1aaee
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/20138/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20138/testReport/
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20138/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          msingh Mukul Kumar Singh added a comment -

          Thanks for the review, Anu Engineer. The latest patch should fix the checkstyle warnings as well.

          Here is what the new help for the command will look like:

          HW13605:multi_thread_upload msingh$ hadoop-dist/target/hadoop-3.0.0-alpha4-SNAPSHOT/bin/hdfs dfs -help copyFromLocal
          -copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst> :
            Copy files from the local file system into fs. Copying fails if the file already
            exists, unless the -f flag is given.
            Flags:
                                                                                           
            -p                 Preserves access and modification times, ownership and the  
                               mode.                                                       
            -f                 Overwrites the destination if it already exists.            
            -t <thread count>  Number of threads to be used, default is 1.                 
            -l                 Allow DataNode to lazily persist the file to disk. Forces   
                               replication factor of 1. This flag will result in reduced   
                               durability. Use with care.                                  
            -d                 Skip creation of temporary file(<dst>._COPYING_).       
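
The help text above describes -t as an optional flag with a default of 1, so existing invocations without it keep working. A minimal sketch of that parsing behaviour is shown below. It is not the code from the patch: the class and method names are hypothetical, and rejecting a missing or non-positive value with an exception is an assumption, since the thread does not say how such input is handled.

```java
public class ThreadCountOptionSketch {

    /** Default taken from the help text: "Number of threads to be used, default is 1." */
    static final int DEFAULT_THREAD_COUNT = 1;

    /** Returns the value that follows "-t", or the default when "-t" is absent. */
    static int parseThreadCount(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if ("-t".equals(args[i])) {
                if (i + 1 >= args.length) {
                    // Assumption: a dangling -t is treated as a usage error.
                    throw new IllegalArgumentException("-t requires a thread count");
                }
                // A non-numeric value raises NumberFormatException, a kind of IllegalArgumentException.
                int count = Integer.parseInt(args[i + 1]);
                if (count < 1) {
                    // Assumption: zero or negative counts are rejected rather than silently corrected.
                    throw new IllegalArgumentException("thread count must be at least 1");
                }
                return count;
            }
        }
        return DEFAULT_THREAD_COUNT;
    }

    public static void main(String[] args) {
        // Existing scripts that never pass -t keep the single-threaded behaviour.
        System.out.println(parseThreadCount(new String[] {"-f", "test", "/single2"}));
        // New callers opt in explicitly.
        System.out.println(parseThreadCount(new String[] {"-t", "10", "test", "/multi1"}));
    }
}
```

Keeping the flag optional means the semantics of the other arguments are untouched, which is the compatibility concern raised during review.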
          
          anu Anu Engineer added a comment -

          Mukul Kumar Singh Thanks for fixing the test issues. Could you please take a look at the checkstyle issues?

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote  Subsystem   Runtime  Comment
          0     reexec      0m 13s   Docker mode activated.
          +1    @author     0m 0s    The patch does not contain any @author tags.
          +1    test4tests  0m 0s    The patch appears to include 3 new or modified test files.
          +1    mvninstall  14m 21s  trunk passed
          +1    compile     17m 8s   trunk passed
          +1    checkstyle  0m 49s   trunk passed
          +1    mvnsite     1m 24s   trunk passed
          +1    findbugs    1m 54s   trunk passed
          +1    javadoc     0m 56s   trunk passed
          +1    mvninstall  0m 49s   the patch passed
          +1    compile     13m 1s   the patch passed
          +1    javac       13m 1s   the patch passed
          -0    checkstyle  0m 42s   hadoop-common-project/hadoop-common: The patch generated 5 new + 66 unchanged - 0 fixed = 71 total (was 66)
          +1    mvnsite     1m 18s   the patch passed
          +1    whitespace  0m 0s    The patch has no whitespace issues.
          +1    xml         0m 1s    The patch has no ill-formed XML file.
          +1    findbugs    2m 6s    the patch passed
          +1    javadoc     0m 56s   the patch passed
          +1    unit        8m 43s   hadoop-common in the patch passed.
          +1    asflicense  0m 39s   The patch does not generate ASF License warnings.
                            66m 12s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-11786
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12875391/HDFS-11786.004.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml
          uname Linux 4dd1f42f7a3b 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / fa1aaee
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/20136/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20136/testReport/
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20136/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          anu Anu Engineer added a comment -

          It looks like the unit test failures are related to this patch. Could you please take a look? Thanks.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote  Subsystem   Runtime  Comment
          0     reexec      0m 13s   Docker mode activated.
          +1    @author     0m 0s    The patch does not contain any @author tags.
          +1    test4tests  0m 0s    The patch appears to include 2 new or modified test files.
          +1    mvninstall  13m 6s   trunk passed
          +1    compile     15m 33s  trunk passed
          +1    checkstyle  0m 33s   trunk passed
          +1    mvnsite     1m 8s    trunk passed
          +1    findbugs    1m 29s   trunk passed
          +1    javadoc     0m 51s   trunk passed
          +1    mvninstall  0m 45s   the patch passed
          +1    compile     13m 8s   the patch passed
          +1    javac       13m 8s   the patch passed
          -0    checkstyle  0m 40s   hadoop-common-project/hadoop-common: The patch generated 17 new + 66 unchanged - 0 fixed = 83 total (was 66)
          +1    mvnsite     1m 15s   the patch passed
          +1    whitespace  0m 0s    The patch has no whitespace issues.
          +1    findbugs    1m 50s   the patch passed
          +1    javadoc     0m 51s   the patch passed
          -1    unit        7m 45s   hadoop-common in the patch failed.
          +1    asflicense  0m 28s   The patch does not generate ASF License warnings.
                            60m 34s



          Reason Tests
          Failed junit tests hadoop.cli.TestCLI
            hadoop.fs.shell.TestCopyFromLocal



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-11786
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12875360/HDFS-11786.003.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux a470da6267f1 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / fa1aaee
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/20129/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/20129/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20129/testReport/
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20129/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          anu Anu Engineer added a comment -

          Mukul Kumar Singh I have requested a forced build for this: https://builds.apache.org/blue/organizations/jenkins/PreCommit-HDFS-Build/detail/PreCommit-HDFS-Build/20129/pipeline

          anu Anu Engineer added a comment -

          +1, pending jenkins.

          msingh Mukul Kumar Singh added a comment -

          Anu Engineer Thanks for the review. I have changed the option to "-t". Please have a look again.

          anu Anu Engineer added a comment -

          Mukul Kumar Singh +1, great work. This is really a good change. One small nit, which you don't have to fix since it is a question of taste: I like my -args to be a single char or something that clearly explains the notion. "-nt" seems neither here nor there. Would you please consider renaming it to "-t"? Either way, I am +1 on this change, pending Jenkins.

          msingh Mukul Kumar Singh added a comment -

          Thanks for the review, Anu Engineer. I have modified copyFromLocal to make it multithreaded.
          The number of threads is an optional parameter; the default is 1.

          This improvement reduces the time to copy files drastically, from 14m7s to 3m18s in my test. The test used 12,000 files with random sizes between 1 and 10 MB.

          Single-threaded upload with the put command:

          [hdfs@y129 ~]$ time /opt/hadoop/hadoop-3.0.0-alpha4-SNAPSHOT/bin/hdfs dfs -put test /single2
          17/06/30 12:06:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
          
          real	14m7.093s
          user	5m48.357s
          sys	1m54.895s
          

          Multithreaded upload with 10 threads using the copyFromLocal command:

          [hdfs@y129 ~]$ time /opt/hadoop/hadoop-3.0.0-alpha4-SNAPSHOT/bin/hdfs dfs -copyFromLocal -nt 10  test /multi1
          17/06/30 12:24:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
          
          real	3m18.574s
          user	3m42.582s
          sys	1m18.718s
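
To put the two wall-clock ("real") times in perspective, the speedup can be worked out directly from the figures above. The snippet below is just that arithmetic; the class and method names are hypothetical, and dividing the speedup by the 10 threads as a rough efficiency figure is an added illustration, not something measured in the test.

```java
import java.util.Locale;

public class SpeedupSketch {

    /** Converts a "XmY.YYYs" reading from the time command into seconds. */
    static double toSeconds(int minutes, double seconds) {
        return minutes * 60 + seconds;
    }

    /** Ratio of the baseline run time to the improved run time. */
    static double speedup(double baselineSeconds, double improvedSeconds) {
        return baselineSeconds / improvedSeconds;
    }

    public static void main(String[] args) {
        double singleThreaded = toSeconds(14, 7.093); // real time of the -put run
        double tenThreads = toSeconds(3, 18.574);     // real time of the 10-thread copyFromLocal run
        double ratio = speedup(singleThreaded, tenThreads);
        System.out.println(String.format(Locale.ROOT,
                "speedup: %.2fx, speedup per thread: %.2f", ratio, ratio / 10));
    }
}
```

The ratio comes out at roughly 4.3x, so ten threads cut the elapsed time by about three quarters without scaling linearly with the thread count.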
          
          anu Anu Engineer added a comment -

          Mukul Kumar Singh Thanks for the initial patch. I would think that this change can be done in copyFromLocal itself instead of introducing a new command.
          We have to be careful that we don't change the semantics of existing arguments and if new arguments are added they should not be required.
          That will ensure that we don't break existing scripts.

          msingh Mukul Kumar Singh added a comment -

          This is the initial patch. I am working on a unit test and will add it in the v2 patch.


            People

            • Assignee:
              msingh Mukul Kumar Singh
              Reporter:
              msingh Mukul Kumar Singh
            • Votes:
              0
              Watchers:
              7
