MAPREDUCE-776
Gridmix: Trace-based benchmark for Map/Reduce

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: benchmarks
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Previous benchmarks (HADOOP-2369, HADOOP-3770), while informed by production jobs, were principally load-generating tools used to validate stability and performance under saturation. The important dimensions of that load, such as submission order/rate, I/O profile, and CPU usage, only accidentally match those of the real load on the cluster. Given related work that characterizes production load (MAPREDUCE-751), it would be worthwhile to use the mined data to impose a corresponding load for tuning and guiding development of the framework.

      The first version will focus on modeling task I/O, submission, and memory usage.
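
      As a rough, hypothetical illustration of what "imposing a corresponding load" requires (the names below are invented for this sketch and are not the committed Gridmix or Rumen API), the minimal per-job quantities a mined trace would have to supply are the submission time, task counts, and per-task byte/record volumes:

      /**
       * Sketch only: the per-job quantities a mined trace must supply so a
       * synthetic job can reproduce the original submission time and I/O
       * profile. Hypothetical, not the committed Gridmix/Rumen API.
       */
      public class TraceJobSpec {
        public final long submitTimeMs;          // offset from the start of the trace
        public final long[] mapInputBytes;       // one entry per map task
        public final long[] mapInputRecords;
        public final long[] mapOutputBytes;
        public final long[] mapOutputRecords;
        public final long[] reduceOutputBytes;   // one entry per reduce task
        public final long[] reduceOutputRecords;
        public final int heapMB;                 // requested task heap size

        public TraceJobSpec(long submitTimeMs,
            long[] mapInputBytes, long[] mapInputRecords,
            long[] mapOutputBytes, long[] mapOutputRecords,
            long[] reduceOutputBytes, long[] reduceOutputRecords, int heapMB) {
          this.submitTimeMs = submitTimeMs;
          this.mapInputBytes = mapInputBytes;
          this.mapInputRecords = mapInputRecords;
          this.mapOutputBytes = mapOutputBytes;
          this.mapOutputRecords = mapOutputRecords;
          this.reduceOutputBytes = reduceOutputBytes;
          this.reduceOutputRecords = reduceOutputRecords;
          this.heapMB = heapMB;
        }

        public int numMaps()    { return mapInputBytes.length; }
        public int numReduces() { return reduceOutputBytes.length; }
      }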

      1. MR776-0.patch
        66 kB
        Chris Douglas
      2. M776-5.patch
        109 kB
        Chris Douglas
      3. M776-4.patch
        107 kB
        Chris Douglas
      4. M776-3.patch
        107 kB
        Chris Douglas
      5. M776-2.patch
        109 kB
        Chris Douglas
      6. M776-1.patch
        99 kB
        Chris Douglas

          Activity

          Iyappan Srinivasan added a comment -

          Hi Douglas,

          I have created a Jira MAPREDUCE-988.

          ant package does not copy the
          hadoop-0.21.0-dev-capacity-scheduler.jar under HADOOP_HOME/build/hadoop-mapred-0.21.0-dev/contrib/capacity-scheduler/.

          It was doing so until September 15. Since you have a code change in build.xml, can you look at it?

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #84 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/84/)
          MAPREDUCE-776. Add Gridmix, a benchmark processing Rumen traces to simulate
          a measured mix of jobs on a cluster.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #42 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/42/)
          MAPREDUCE-776. Add Gridmix, a benchmark processing Rumen traces to simulate
          a measured mix of jobs on a cluster.

          Chris Douglas added a comment -

          I committed this.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12419708/M776-5.patch
          against trunk revision 815483.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/84/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/84/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/84/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/84/console

          This message is automatically generated.

          Chris Douglas added a comment -

          Missed the Mini*Cluster dependencies

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12419661/M776-4.patch
          against trunk revision 815249.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/81/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/81/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/81/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/81/console

          This message is automatically generated.

          Chris Douglas added a comment -

          Fixed classpath issues on Hudson caused by build file modifications.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12419612/M776-3.patch
          against trunk revision 815249.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/79/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/79/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/79/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/79/console

          This message is automatically generated.

          Hong Tang added a comment -

          Patch looks good. +1.

          Chris Douglas added a comment -

          Updated based on feedback from Hong

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12419603/M776-2.patch
          against trunk revision 814749.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 8 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/77/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/77/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/77/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/77/console

          This message is automatically generated.

          Chris Douglas added a comment -

          Patch for review.

          Chris Douglas added a comment -

          Updated patch. Processes Rumen job traces and satisfies the preceding requirements, save for the memory modeling.

          Chris Douglas added a comment -

          Extremely preliminary shell of a nascent patch-in-progress using mocked Rumen traces for testing. Mostly experiments and scaffolding tagged with TODO items at this point, but the driver works and the I/O is correct through the shuffle within expected tolerances.

          Chris Douglas added a comment -

          Version 1

          The following is a list of planned features that distinguish Gridmix from its predecessors:

          • Workload is generated from job history trace analyzer Rumen ( MAPREDUCE-751 ).
          • Model the details of IO workload:
            • Much larger working set.
            • Input and output data volumes for every task.
            • Input and output record counts for every task.
          • Model the details of memory resource usage.
          • Model the job submission rates

          The main purpose of Gridmix is to evaluate MapReduce and HDFS performance. It will not, and is not intended to, capture improvements to layers on top of MapReduce. Like its predecessors, Gridmix will be principally a job submission client (though collecting high-level metrics, as was attempted in Gridmix2, would be a natural extension) and will be developed in several stages. A usable and useful V1.0 of the submitting client must satisfy the following requirements in each of the functional categories listed below.

          Simplifying Assumptions

          The context for the following list is unpacked in subsequent sections. Briefly, the job properties that will not be closely reproduced:

          • CPU usage. We have no data for per-task CPU usage, so we cannot attempt even an approximation (e.g. tasks should never be CPU bound, though this surely happens in practice).
          • Filesystem properties. We will make no attempt to match block sizes, namespace hierarchies, or any property of input, intermediate, or output data other than the bytes/records consumed and emitted from a given task.
           • I/O rates. Related to CPU usage, the rate at which a given task consumes/emits records is assumed to be 1) limited only by the speed of the reader/writer and 2) constant throughout the job (see the sketch following this list).
          • Memory profile. No information on tasks' memory use is available, other than the heap size requested. Given data on the frequency and size of allocations, a more thorough profile could be compiled and reproduced.
          • Skew. We assume records read and output by a task follow observed averages. This is optimistic, as fewer buffers will require resizing, etc. Further, we assume that each map generates a proportional percentage of each reduce's input.
          • Job failure. We assume that framework and hardware faults are the only cause of task failure, i.e. user code is correct.
          • Job independence. The output or outcome of one job does not affect when or whether another job will run.
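
          A minimal sketch of the constant-rate assumption above, using a hypothetical helper that is not part of the patch: a task's traced output is apportioned evenly across its input records so the totals match the trace once all input has been consumed.

          /** Sketch only: evenly apportion a task's traced output across its
           *  input records, per the constant-rate assumption. Hypothetical
           *  helper, not code from the patch. */
          public final class EvenRate {
            private EvenRate() { }

            /** Records to emit after consuming input record i (0-based), so that
             *  the totals match the trace exactly once all inputs are read. */
            public static long recordsToEmit(long i, long inputRecords, long outputRecords) {
              long targetAfter  = ((i + 1) * outputRecords) / inputRecords;  // cumulative target
              long targetBefore = (i * outputRecords) / inputRecords;        // already emitted
              return targetAfter - targetBefore;
            }

            public static void main(String[] args) {
              long emitted = 0;
              for (long i = 0; i < 7; ++i) {
                emitted += recordsToEmit(i, 7, 10);
              }
              System.out.println("emitted = " + emitted);  // prints 10
            }
          }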

          Components

          • Data Generation. When run on a cluster without sufficient data- a common case for testing new versions of Hadoop- the submitting client should have a data generation phase to populate the cluster with a configurable volume of data.
             • Reasonable distribution. We do not yet have sufficient data to generate a simulated block map. In the future, we may discover that this requires some processing of/from the Rumen trace to write data similar to that of the actual jobs. For now, we require only that the input data are distributed more-or-less evenly across the cluster, to avoid artificial hot-spots in the Gridmix data.
             • Content. Since the map will employ a trivial binary reader, there are no constraints on the generated data. Future versions of Gridmix could produce compressible or compressed data, or even mock data to be consumed by a non-trivial reader, but the only requirement we impose in V1.0 is that the generated data are random (i.e. not trivially compressible).
            • Namespace. The generated data will make no attempt to emulate the block size, file hierarchy, or any other property of the data over which the user jobs originally ran. The generated data will be written into a flat namespace.
          • Generic MR job. Gridmix must have a generic job to effect the workloads in the trace it receives from Rumen. For the first version, our focus will be on reproducing I/O traffic and memory.
            • Task configuration. The number of maps and reduces must match the Rumen trace description.
            • Memory usage. Lacking explicit data on actual memory usage in task JVMs, we assume that each task will consume and use most of the heap allocated to it. This is a pessimistic estimate- since it is expected that most users accept the default setting and use considerably less- but it will be predictable.
             • Map fidelity. Each map will read the number of records and bytes specified in the Rumen trace. The map is assumed to emit output at an even rate, so for each input record it emits output at a per-record rate that yields the correct total number of output records. Without data on the record read rate (and CPU usage), we cannot simulate the effort expended per record, but the map itself will do some nominal work. We do not control block size, so if the data read by the original user job had a different block size, the percentage of remote and local reads will likely differ.
             • Shuffle fidelity. We do not have sufficient data to describe skewed task output, e.g. a job with more than one reduce in which at least one reduce receives most of its input from a single map task. Instead, we will assume that each map generates a percentage of bytes and records equal to that received by a given reduce. For example, if reduce r receives i% of the input records and j% of the bytes in the shuffle, then map m will output i% of its output records such that j% of its bytes go to reduce r (a sketch of this proportional split follows this list). This will, with some error, match the input record and byte counts for each reduce. Note that combiners in user jobs will affect the byte and record counts we consider, but no combiner will be run. Whether Rumen reports the shuffled bytes/records as the map output, or as the bytes output from the map prior to the combiner, will not be distinguished by the Gridmix submission client.
            • Reduce fidelity. Each reduce will- by definition- receive the set of bytes and records assigned to it from each map. As in the map, it will output at a constant rate, relative to the input records, such that it will output the correct number of records and bytes as described in the Rumen trace.
          • Driver Program.
             • Input parameters. The Gridmix client will accept a Rumen job trace (either from a URI or from stdin), a work directory, and, optionally, the size of the data to be generated. One data-generating task will be started per TT node; each will generate the same amount of data.
             • Submission. Given that Gridmix is simulating a set of submitting clients, it must be possible to submit jobs concurrently in case split generation would otherwise cause a deadline to be missed (a submission-loop sketch also follows this list). (Un)Fortunately, scalability limitations at the JobTracker impose an upper bound on the submission rate the client can effect. Errors in submission must be reported at the client.
            • Cancellation and shutdown. It should be possible to cancel a run of Gridmix at any time without leaving jobs running on the cluster. Gridmix must make a best-effort to kill any running jobs it has started before exiting through a shutdown hook.
            • Monitoring. Though V1.0 will not attempt to monitor job dependencies, and will merely submit each job at its appointed deadline, Gridmix must monitor the jobs it has submitted and perform success/failure callbacks for every completed job. For V1.0, it is assumed that Rumen will provide all necessary intelligence for processing the run at the JobTracker.
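
          Below is a sketch of the proportional split described under "Shuffle fidelity" above, using a hypothetical helper rather than the committed code: each map's output bytes are divided across reduces in proportion to each reduce's share of the total shuffle volume recorded in the trace; records would be apportioned the same way.

          /** Sketch only: proportional shuffle split, per "Shuffle fidelity".
           *  Hypothetical helper, not code from the patch. */
          public final class ShuffleSplit {
            private ShuffleSplit() { }

            /**
             * @param mapOutputBytes     total bytes this map will emit (from the trace)
             * @param reduceShuffleBytes per-reduce shuffle bytes observed in the trace
             * @return bytes this map should send to each reduce, proportional to
             *         that reduce's share of the total shuffle volume
             */
            public static long[] bytesPerReduce(long mapOutputBytes, long[] reduceShuffleBytes) {
              long total = 0;
              for (long b : reduceShuffleBytes) {
                total += b;
              }
              long[] quota = new long[reduceShuffleBytes.length];
              if (total <= 0) {
                return quota;  // no shuffle volume recorded; nothing to apportion
              }
              long assigned = 0;
              for (int r = 0; r < quota.length; ++r) {
                if (r == quota.length - 1) {
                  quota[r] = mapOutputBytes - assigned;  // last reduce absorbs rounding error
                } else {
                  quota[r] = (long) (mapOutputBytes * ((double) reduceShuffleBytes[r] / total));
                  assigned += quota[r];
                }
              }
              return quota;
            }
          }

          A second sketch illustrates the driver behavior described under "Driver Program": jobs are submitted at their traced offsets from a small thread pool so that split generation cannot push a later job past its deadline, submission errors are reported at the client, and a shutdown hook makes a best effort to kill anything still running. Class and method names are invented for this sketch; the actual driver in the patch may differ.

          import java.util.List;
          import java.util.concurrent.CopyOnWriteArrayList;
          import java.util.concurrent.Executors;
          import java.util.concurrent.ScheduledExecutorService;
          import java.util.concurrent.TimeUnit;

          /** Sketch of the submission/shutdown behavior described under
           *  "Driver Program". Hypothetical, not the committed Gridmix driver. */
          public class SubmissionDriver {

            /** Stand-in for a submitted, killable job handle. */
            public interface Submitted {
              boolean isComplete();
              void kill();
            }

            /** Stand-in for one traced job: when to submit it, and how. */
            public interface TraceJob {
              long submitOffsetMs();          // offset from the start of the run
              Submitted submit() throws Exception;
            }

            private final ScheduledExecutorService pool = Executors.newScheduledThreadPool(4);
            private final List<Submitted> running = new CopyOnWriteArrayList<>();

            public void run(List<TraceJob> trace) {
              // best-effort cleanup: kill anything still running when the client exits
              Runtime.getRuntime().addShutdownHook(new Thread(() -> {
                for (Submitted job : running) {
                  if (!job.isComplete()) {
                    job.kill();
                  }
                }
              }));
              for (TraceJob job : trace) {
                // concurrent, scheduled submission so slow split generation
                // cannot cause a later job to miss its deadline
                pool.schedule(() -> {
                  try {
                    running.add(job.submit());
                  } catch (Exception e) {
                    System.err.println("Submission failed: " + e);  // report at the client
                  }
                }, job.submitOffsetMs(), TimeUnit.MILLISECONDS);
              }
            }
          }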

          Future Work

          There are a number of ideas and approaches we elect to defer until either the need is demonstrated or more time is available for implementation. Among them:

          • Job throttling. Rather than attempt any sort of backoff at the client in response to JobTracker load, we take the position that a trace is an inviolate submission profile. This allows us to make comparisons between two runs of the same trace. However, if this is to replace GridMix and GridMix2 as the de facto load generator or if one wants to throttle submission on evidence that it interferes with some unrelated measurement, such throttling may become attractive.
           • Compression. The compressibility of input, intermediate, and output data can be mined from task logs. It would be possible to generate data that roughly matches the observed "compressibility" of user jobs, but this would require a targeted effort.
          • Designed jobs. Given that MapReduce has a range of widely used libraries- whose development is a considerable portion of JIRA traffic- tracking improvements to this core functionality in proportion to their prevalence in user jobs would aid in identifying bottlenecks in the application layer. Support for specifying InputFormats, OutputFormats, and even commonly used Mappers and Reducers would augment the synthetic mix.
          • Failure. Gridmix assumes that failures are caused by hardware or flaws in the framework, so it makes no attempt to inject failures into a particular run. Since this does not match some of the preliminary insights into the causes of task failure, it may be argued that the Rumen trace should include- and Gridmix should honor- application failures. Tracing where in the input the failure occurs- and which component fails- would be a way to characterize and measure approaches to mitigating resource waste. In particular, one may improve the proportion of "useful" work- and thus throughput- with a better strategy for punishing/throttling jobs that will ultimately fail.
          • Job dependencies. Modeling dependencies between jobs would remove a lower bound on job throughput by starting dependent jobs immediately once their prerequisites have been met. In general, one can imagine a submission client that would accept a list of prerequisites of which a submission time is merely the most common.
          • Isolation. We assume that Gridmix "owns" the cluster it's running on, i.e. the only jobs running on the cluster are those submitted by an instance of the client. Supporting load generation alongside user tasks would be another use for throttling job submission (and a possible alternative to very fine simulations of specific user tasks).

            People

             • Assignee: Chris Douglas
             • Reporter: Chris Douglas
             • Votes: 0
             • Watchers: 17
