Hadoop Map/Reduce / MAPREDUCE-2153

Bring more job configuration properties into the trace file

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: tools/rumen
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Adds job configuration parameters to the job trace. The configuration parameters are stored under the 'jobProperties' field as key-value pairs.
    • Tags:
      rumen, job-conf, job-properties

      Description

      To emulate distributed cache usage in gridmix jobs, the following 9 configuration properties need to be available in the trace file:
      (1) mapreduce.job.cache.files
      (2) mapreduce.job.cache.files.visibilities
      (3) mapreduce.job.cache.files.filesizes
      (4) mapreduce.job.cache.files.timestamps

      (5) mapreduce.job.cache.archives
      (6) mapreduce.job.cache.archives.visibilities
      (7) mapreduce.job.cache.archives.filesizes
      (8) mapreduce.job.cache.archives.timestamps

      (9) mapreduce.job.cache.symlink.create

      To emulate data compression in gridmix jobs, the trace file should contain the following configuration properties:
      (1) mapreduce.map.output.compress
      (2) mapreduce.map.output.compress.codec
      (3) mapreduce.output.fileoutputformat.compress
      (4) mapreduce.output.fileoutputformat.compress.codec
      (5) mapreduce.output.fileoutputformat.compress.type

      Ideally, gridmix should also set many job-specific configuration properties (io.sort.mb, io.sort.factor, etc.) when running simulated jobs, so that they have the same effect as the original job in terms of spilled records, number of merges, and so on.

      TraceBuilder should bring all of these properties into the generated trace file.
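
To illustrate how a consumer of the trace might use this, here is a minimal sketch in Python (not taken from the patch). It assumes a job record is a JSON object whose 'jobProperties' field maps property names to string values, as the release note describes. The sample record, the sample values, and the helper name `select_properties` are hypothetical.

```python
import json

# Property names copied from the two lists in the description above.
DISTRIBUTED_CACHE_KEYS = [
    "mapreduce.job.cache.files",
    "mapreduce.job.cache.files.visibilities",
    "mapreduce.job.cache.files.filesizes",
    "mapreduce.job.cache.files.timestamps",
    "mapreduce.job.cache.archives",
    "mapreduce.job.cache.archives.visibilities",
    "mapreduce.job.cache.archives.filesizes",
    "mapreduce.job.cache.archives.timestamps",
    "mapreduce.job.cache.symlink.create",
]
COMPRESSION_KEYS = [
    "mapreduce.map.output.compress",
    "mapreduce.map.output.compress.codec",
    "mapreduce.output.fileoutputformat.compress",
    "mapreduce.output.fileoutputformat.compress.codec",
    "mapreduce.output.fileoutputformat.compress.type",
]


def select_properties(job_record, wanted_keys):
    """Return the entries of 'jobProperties' whose keys are in wanted_keys.

    Hypothetical helper: a record without 'jobProperties' yields an empty dict.
    """
    props = job_record.get("jobProperties") or {}
    return {key: props[key] for key in wanted_keys if key in props}


# Hypothetical job record. Only the 'jobProperties' field name comes from
# the release note; the values are made up for illustration.
sample_record = json.loads("""
{
  "jobProperties": {
    "mapreduce.job.cache.files": "hdfs://example/tmp/lookup.txt",
    "mapreduce.job.cache.files.filesizes": "1024",
    "mapreduce.map.output.compress": "true",
    "mapreduce.task.io.sort.mb": "100"
  }
}
""")

cache_props = select_properties(sample_record, DISTRIBUTED_CACHE_KEYS)
compression_props = select_properties(sample_record, COMPRESSION_KEYS)
```

Keeping the two key lists separate lets an emulator enable distributed cache emulation and compression emulation independently.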

      1. MapReduce-2153-trunk.patch
        130 kB
        Rajesh Balamohan
      2. MapReduce-2153-trunk.patch
        130 kB
        Rajesh Balamohan
      3. MR-2153-patch.txt
        128 kB
        Rajesh Balamohan
      4. mr-2153-test-patch-results.txt
        213 kB
        Rajesh Balamohan

          Activity

          Ravi Gummadi added a comment -

          Some of the job-specific configuration properties mentioned in MAPREDUCE-1658 that should be brought into the trace file (this list is not complete):

          • mapreduce.map.speculative, mapreduce.reduce.speculative
          • mapreduce.job.reduce.slowstart.completedmaps
          • mapreduce.task.io.sort.factor, mapreduce.task.io.sort.mb, mapreduce.map.sort.spill.percent
          • mapreduce.reduce.shuffle.connect.timeout, mapreduce.reduce.shuffle.read.timeout
          • mapreduce.reduce.shuffle.merge.percent, mapreduce.reduce.shuffle.input.buffer.percent
          • mapreduce.reduce.merge.inmem.threshold

          Resolved MAPREDUCE-1658 as duplicate of this JIRA.
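
A simulator that reads these tuning properties back from the trace would probably need to convert them from strings to typed values. The sketch below shows one way to do that in Python. It is not part of the patch; the value types reflect my understanding of these Hadoop properties and should be treated as assumptions, and the names `TUNING_PROPERTY_TYPES` and `typed_tuning_properties` are hypothetical.

```python
def parse_bool(text):
    """Interpret a property string as a boolean ("true" in any letter case)."""
    return text.strip().lower() == "true"


# Assumed value type for each property listed in the comment above.
TUNING_PROPERTY_TYPES = {
    "mapreduce.map.speculative": parse_bool,
    "mapreduce.reduce.speculative": parse_bool,
    "mapreduce.job.reduce.slowstart.completedmaps": float,
    "mapreduce.task.io.sort.factor": int,
    "mapreduce.task.io.sort.mb": int,
    "mapreduce.map.sort.spill.percent": float,
    "mapreduce.reduce.shuffle.connect.timeout": int,
    "mapreduce.reduce.shuffle.read.timeout": int,
    "mapreduce.reduce.shuffle.merge.percent": float,
    "mapreduce.reduce.shuffle.input.buffer.percent": float,
    "mapreduce.reduce.merge.inmem.threshold": int,
}


def typed_tuning_properties(job_properties):
    """Convert the known tuning properties to typed values.

    Keys that are not in TUNING_PROPERTY_TYPES are ignored, and keys that
    are absent from job_properties are simply left out of the result.
    """
    return {
        key: convert(job_properties[key])
        for key, convert in TUNING_PROPERTY_TYPES.items()
        if key in job_properties
    }


# Hypothetical property values, stored as strings.
example_properties = {
    "mapreduce.task.io.sort.mb": "100",
    "mapreduce.map.speculative": "false",
    "mapreduce.map.sort.spill.percent": "0.8",
    "some.unrelated.property": "x",
}
tuning = typed_tuning_properties(example_properties)
```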

          Rajesh Balamohan added a comment -

          Attaching the patch for Apache trunk. This patch ensures that all job properties are saved in the JSON file under the "jobProperties" tag.

          Rajesh Balamohan added a comment -

          Attaching the ant test-patch results.

          The findbugs warnings are not related to this patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475671/mr-2153-test-patch-results.txt
          against trunk revision 1090390.

          -1 @author. The patch appears to contain 3 @author tags which the Hadoop community has agreed to not allow in code contributions.

          +1 tests included. The patch appears to include 503 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/164//console

          This message is automatically generated.

          Rajesh Balamohan added a comment -

          Uploading the same patch again so that it runs via Hudson.

          Amar Kamat added a comment -

          Cancelling as Hudson picked up the wrong file.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12476085/MapReduce-2153-trunk.patch
          against trunk revision 1090390.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 2247 javac compiler warnings (more than the trunk's current 2244 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/165//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/165//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/165//console

          This message is automatically generated.

          Rajesh Balamohan added a comment -

          Fixed the javac warnings from the earlier patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12476206/MR-2153-patch.txt
          against trunk revision 1090390.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/166//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/166//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/166//console

          This message is automatically generated.

          Amar Kamat added a comment -

          The two test cases that failed are:
          1. org.apache.hadoop.mapred.TestFairSchedulerSystem.testFairSchedulerSystem
          2. org.apache.hadoop.raid.TestBlockFixerDistConcurrency.testConcurrentJobs

          They are not related to this patch. I will go ahead and commit this.

          Ravi Gummadi added a comment -

          Marking those methods as deprecated and removing or modifying the calls to them seems to be a better way to address the issues identified by the javac warnings.

          Ravi Gummadi added a comment -

          Discussed offline with Amar. There seems to be no way to mark those methods as deprecated and remove the references to them without breaking backward compatibility, that is, support for older trace files that contain those fields but do not contain JobProperties.

          So let us go ahead and keep the existing setters and getters as they are for some time (a few releases), and make sure that no more setters and getters are added from now on.

          +1 for the latest patch.
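
The compatibility concern can be pictured with a small reader-side sketch in Python. It is an illustration only and does not reflect the actual Rumen classes: it assumes newer job records carry a 'jobProperties' map, while older ones expose a setting as a top-level field. The field name `legacySortMb` and the function `get_setting` are hypothetical.

```python
def get_setting(job_record, property_name, legacy_field=None, default=None):
    """Look up a setting, preferring 'jobProperties' over a legacy field.

    job_record    -- dict parsed from one job entry of a trace file
    property_name -- key expected inside the 'jobProperties' map
    legacy_field  -- hypothetical top-level field used by older traces
    default       -- value returned when neither location has the setting
    """
    props = job_record.get("jobProperties") or {}
    if property_name in props:
        return props[property_name]
    if legacy_field is not None and legacy_field in job_record:
        return job_record[legacy_field]
    return default


# Hypothetical records: one in the newer layout, one in the older layout.
new_record = {"jobProperties": {"mapreduce.task.io.sort.mb": "100"}}
old_record = {"legacySortMb": 64}

new_value = get_setting(new_record, "mapreduce.task.io.sort.mb", "legacySortMb")
old_value = get_setting(old_record, "mapreduce.task.io.sort.mb", "legacySortMb")
```

With this kind of fallback, older trace files stay readable while newer ones rely only on the key-value map.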

          Amar Kamat added a comment -

          I just committed this. Thanks Rajesh and Ravi!


            People

            • Assignee:
              Rajesh Balamohan
              Reporter:
              Ravi Gummadi
            • Votes:
              0
              Watchers:
              2
