Hadoop Map/Reduce
MAPREDUCE-3182

loadgen ignores -m command line when writing random data

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.23.0, 2.3.0
    • Fix Version/s: None
    • Component/s: documentation, mrv2, test
    • Labels:
    • Target Version/s:

      Description

      If no input directories are specified, loadgen enters a special mode in which random data is generated and written. In that mode, the number of mappers set with the -m command line option is overridden by a calculation. Instead, loadgen should use the user-specified number of mappers when one is given and fall back to the calculation only otherwise. The documentation should also be updated to match the new behavior in the code.
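The requested behavior can be sketched roughly as follows. This is only an illustration of the fallback rule described above, not code from the Hadoop tree: the class name MapCountResolver, the method resolveNumMaps, and the choice to treat a non-positive -m value as "not specified" are all assumptions made for the example.

```java
import java.util.OptionalInt;

// Hypothetical helper, not taken from the Hadoop source tree.
class MapCountResolver {

    /**
     * Picks the number of map tasks for random-data mode.
     *
     * @param userMaps       value of -m if the user supplied one, else empty
     * @param calculatedMaps value produced by the existing calculation
     */
    public static int resolveNumMaps(OptionalInt userMaps, int calculatedMaps) {
        // Honor an explicit, positive -m value (assumption: <= 0 means unset).
        if (userMaps.isPresent() && userMaps.getAsInt() > 0) {
            return userMaps.getAsInt();
        }
        // Otherwise fall back to the calculated value.
        return calculatedMaps;
    }

    public static void main(String[] args) {
        System.out.println(resolveNumMaps(OptionalInt.of(4), 100));
        System.out.println(resolveNumMaps(OptionalInt.empty(), 100));
    }
}
```

Keeping the decision in one small function would make it easy to cover both branches in a unit test, independent of how the calculated value is obtained.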

        Activity

        Chen He added a comment -

        I will take a look at this issue.

        Chen He added a comment -

        There are two GenericMRLoadGenerator classes in the current Hadoop source code.
        One is in the org.apache.hadoop.mapreduce package. It has two documentation problems. First, it does not actually parse the "-m" command line option but still shows this option in the "Usage" text. Second, if the user does not specify an input directory, it creates input data using RandomWriter with default settings (10GB per map task and 10 map tasks per node), but the "Usage" text does not mention this behavior.

        The other is in the org.apache.hadoop.mapred package; it is an older version of GenericMRLoadGenerator. It has the second documentation problem described in the paragraph above.
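To make the first problem concrete, here is a minimal sketch of what parsing "-m" and detecting a missing "-indir" could look like. It is a standalone illustration under stated assumptions: the class LoadGenArgs and its methods are hypothetical, and the sample arguments in main are made up for the example. It is not the GenericMRLoadGenerator implementation.

```java
import java.util.OptionalInt;

// Hypothetical parser, not the GenericMRLoadGenerator implementation.
class LoadGenArgs {

    /** Returns the value following "-m", or empty when the flag is absent. */
    public static OptionalInt parseNumMaps(String[] argv) {
        for (int i = 0; i < argv.length; i++) {
            if ("-m".equals(argv[i])) {
                if (i + 1 >= argv.length) {
                    throw new IllegalArgumentException("-m requires a value");
                }
                return OptionalInt.of(Integer.parseInt(argv[i + 1]));
            }
        }
        return OptionalInt.empty();
    }

    /** True when "-indir" is absent, i.e. random input would be generated. */
    public static boolean usesRandomInput(String[] argv) {
        for (String arg : argv) {
            if ("-indir".equals(arg)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Illustrative arguments only.
        String[] argv = {"-m", "8", "-outdir", "out"};
        System.out.println("maps requested: " + parseNumMaps(argv));
        System.out.println("random input: " + usesRandomInput(argv));
    }
}
```

Returning an empty OptionalInt when "-m" is absent lets the caller distinguish "not specified" from any numeric value, which is what a fallback to the calculated map count would need.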

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12640090/MAPREDUCE-3182.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4512//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4512//console

        This message is automatically generated.

        Chen He added a comment -

        Hi Jonathan Eagles, would you mind taking a look at this patch? Thank you very much!

        Hadoop QA added a comment -

        -1 overall

        Vote  Subsystem         Runtime   Comment
        0     pre-patch         5m 18s    Pre-patch trunk compilation is healthy.
        +1    @author           0m 0s     The patch does not contain any @author tags.
        +1    tests included    0m 0s     The patch appears to include 2 new or modified test files.
        +1    javac             7m 41s    There were no new javac warning messages.
        +1    release audit     0m 19s    The applied patch does not increase the total number of release audit warnings.
        +1    checkstyle        0m 34s    There were no new checkstyle issues.
        +1    whitespace        0m 0s     The patch has no lines that end in whitespace.
        +1    install           1m 34s    mvn install still works.
        +1    eclipse:eclipse   0m 31s    The patch built with eclipse:eclipse.
        +1    findbugs          0m 42s    The patch does not introduce any new Findbugs (version 2.0.3) warnings.
        -1    mapreduce tests   97m 2s    Tests failed in hadoop-mapreduce-client-jobclient.
                                113m 44s

        Reason             Tests
        Failed unit tests  hadoop.mapreduce.v2.TestMRJobsWithProfiler
                           hadoop.mapred.TestMRIntermediateDataEncryption

        Subsystem                                   Report/Notes
        Patch URL                                   http://issues.apache.org/jira/secure/attachment/12640090/MAPREDUCE-3182.patch
        Optional Tests                              javac unit findbugs checkstyle
        git revision                                trunk / 6ae2a0d
        hadoop-mapreduce-client-jobclient test log  https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5606/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
        Test Results                                https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5606/testReport/
        Java                                        1.7.0_55
        uname                                       Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output                              https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5606/console

        This message was automatically generated.

        Akira AJISAKA added a comment -

        Mostly looks good to me. Two comments:
        1. For mapred/GenericMRLoadGenerator, would you document that the number of map tasks specified by the -m option is overridden when -indir is not specified?
        2.

        +    "RandomWriter will be used to create input directory and data if \"-indir\"" +
        +    "RandomWriter will be used to create input directory and data if [-indir]" +

        Is there any reason to use different forms for -indir?

        Chen He added a comment -

        Thank you for the review, Akira AJISAKA. I will update this weekend.


          People

          • Assignee: Chen He
          • Reporter: Jonathan Eagles
          • Votes: 0
          • Watchers: 4
