Hadoop Map/Reduce / MAPREDUCE-2192

Implement gridmix system tests with different time intervals for MR streaming job traces.

Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: contrib/gridmix
    • Labels: None

    Description

      Develop gridmix system tests for the scenarios below, using MR streaming job traces folded to different time intervals.

      1. Generate input data based on cluster size, create synthetic jobs from the 2 min folded MR streaming jobs trace, and submit the jobs with the arguments below.
      GRIDMIX_JOB_TYPE = LOADJOB
      GRIDMIX_USER_RESOLVER = SubmitterUserResolver
      GRIDMIX_SUBMISSION_POLICY = STRESS
      GRIDMIX_JOB_SUBMISSION_QUEUE_IN_TRACE = True
      Input Size = 250 MB * No. of nodes in cluster.
      MINIMUM_FILE_SIZE=150MB
      TRACE_FILE = 2 min folded trace.
      Verify the JobStatus of each job, the input split size of each job, and the summary (QueueName, UserName, StartTime, FinishTime, maps, reducers, counters, etc.) after execution completes.
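
The input-size rule in scenario 1 is simple arithmetic. Below is a minimal sketch of it, together with one way to gather the scenario's arguments. The function names, the dict layout, and the choice of 1 MB = 1024 * 1024 bytes are assumptions made for illustration; they are not taken from the attached patch or from Gridmix itself.

```python
MB = 1024 * 1024  # assumption: binary megabytes


def total_input_bytes(node_count, mb_per_node):
    """Total input data to generate: per-node size times cluster size."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return node_count * mb_per_node * MB


def scenario_one_settings(node_count, trace_path):
    """Collect the scenario 1 arguments in one place (hypothetical layout)."""
    return {
        "GRIDMIX_JOB_TYPE": "LOADJOB",
        "GRIDMIX_USER_RESOLVER": "SubmitterUserResolver",
        "GRIDMIX_SUBMISSION_POLICY": "STRESS",
        "GRIDMIX_JOB_SUBMISSION_QUEUE_IN_TRACE": True,
        "INPUT_SIZE_BYTES": total_input_bytes(node_count, 250),
        "MINIMUM_FILE_SIZE_BYTES": 150 * MB,
        "TRACE_FILE": trace_path,
    }
```

For a 4-node cluster this gives 4 * 250 MB = 1000 MB of input data.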

      2. Generate input data based on cluster size, create synthetic jobs from the 3 min folded MR streaming jobs trace, and submit the jobs with the arguments below.
      GRIDMIX_JOB_TYPE = LoadJob
      GRIDMIX_USER_RESOLVER = RoundRobinUserResolver
      GRIDMIX_BYTES_PER_FILE = 150 MB
      GRIDMIX_SUBMISSION_POLICY = REPLAY
      GRIDMIX_JOB_SUBMISSION_QUEUE_IN_TRACE = True
      Input Size = 200 MB * No. of nodes in cluster.
      PROXY_USERS = proxy users file path
      TRACE_FILE = 3 min folded trace.
      Verify the JobStatus of each job, the input split size of each job, and the summary (QueueName, UserName, StartTime, FinishTime, maps, reducers, counters, etc.) after execution completes.
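
Scenario 2 relies on a round-robin user resolver together with a proxy users file. The general idea of mapping each user in the trace to a proxy user in rotation can be sketched as follows. This illustrates the concept only and is not the code of Gridmix's RoundRobinUserResolver; the function name and the in-memory list of proxy users are hypothetical.

```python
def make_round_robin_resolver(proxy_users):
    """Return a function that maps trace users to proxy users in rotation."""
    if not proxy_users:
        raise ValueError("at least one proxy user is required")
    assigned = {}  # trace user -> proxy user, fixed once assigned

    def resolve(trace_user):
        if trace_user not in assigned:
            # Each previously unseen trace user takes the next proxy user in the cycle.
            assigned[trace_user] = proxy_users[len(assigned) % len(proxy_users)]
        return assigned[trace_user]

    return resolve
```

With two proxy users, the first three distinct trace users would map to the first, second, and first proxy user respectively, and a repeated trace user keeps its earlier mapping.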

      3. Generate input data based on cluster size, create synthetic jobs from the 5 min folded MR streaming jobs trace, and submit the jobs with the arguments below.
      GRIDMIX_JOB_TYPE = LoadJob
      GRIDMIX_USER_RESOLVER = SubmitterUserResolver
      GRIDMIX_SUBMISSION_POLICY = SERIAL
      GRIDMIX_JOB_SUBMISSION_QUEUE_IN_TRACE = false
      GRIDMIX_KEY_FRC = 0.5f
      Input Size = 200MB * No. of nodes in cluster.
      TRACE_FILE = 5 min folded trace.
      Verify the JobStatus of each job and the summary (QueueName, UserName, StartTime, FinishTime, MAPS, REDUCERS, COUNTERS, etc.) after execution completes.
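
The verification step shared by all three scenarios amounts to comparing the summary expected from the trace with what the simulated job reports. A minimal sketch of such a check is shown below, assuming both summaries are available as plain dicts. The field names and helper functions are hypothetical and do not reflect the structure used in the attached patch.

```python
# Hypothetical field names; which fields to compare depends on the scenario.
SUMMARY_FIELDS = ("queue_name", "user_name", "maps", "reducers")


def summary_mismatches(expected, actual, fields=SUMMARY_FIELDS):
    """Return {field: (expected, actual)} for every compared field that differs."""
    return {
        field: (expected.get(field), actual.get(field))
        for field in fields
        if expected.get(field) != actual.get(field)
    }


def times_are_consistent(summary):
    """A job cannot finish before it starts."""
    return summary["start_time"] <= summary["finish_time"]
```

An empty result from `summary_mismatches` means every compared field agrees. In scenario 3, where the queue is not taken from the trace, a test would likely pass a `fields` tuple that omits the queue name.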

      Attachments

        1. MAPREDUCE-2192.patch
          146 kB
          Vinay Kumar Thota
        2. MAPREDUCE-2192.patch
          139 kB
          Vinay Kumar Thota

            People

              Assignee: Vinay Kumar Thota (vinaythota)
              Reporter: Vinay Kumar Thota (vinaythota)