Hadoop Map/Reduce / MAPREDUCE-2192

Implement gridmix system tests with different time intervals for MR streaming job traces.


Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: contrib/gridmix
    • Labels: None

    Description

      Develop gridmix system tests for the following scenarios, using MR streaming job traces folded to different time intervals.

      1. Generate input data based on cluster size, create synthetic jobs from the 2 min folded MR streaming jobs trace, and submit the jobs with the following arguments:
      GRIDMIX_JOB_TYPE = LOADJOB
      GRIDMIX_USER_RESOLVER = SubmitterUserResolver
      GRIDMIX_SUBMISSION_POLICY = STRESS
      GRIDMIX_JOB_SUBMISSION_QUEUE_IN_TRACE = True
      Input Size = 250 MB * No. of nodes in cluster.
      MINIMUM_FILE_SIZE = 150 MB
      TRACE_FILE = 2 min folded trace.
      Verify the JobStatus of each job, the input split size of each job, and the summary (QueueName, UserName, StartTime, FinishTime, maps, reducers, counters, etc.) after execution completes.
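
The input size rule in scenario 1 (250 MB per node, scaled by cluster size) can be sketched as below. This is only an illustration, not part of the attached patch: the function names, the dictionary layout, and the reading of "MB" as 2**20 bytes are assumptions, and the keys are the ticket's own parameter names rather than actual Gridmix property names.

```python
MB = 1024 * 1024  # assumption: the ticket's "MB" is read as 2**20 bytes


def input_size_bytes(mb_per_node: int, num_nodes: int) -> int:
    """Total input data to generate: per-node size times cluster size."""
    if num_nodes < 1:
        raise ValueError("cluster must have at least one node")
    return mb_per_node * MB * num_nodes


def scenario_1(num_nodes: int, trace_file: str) -> dict:
    """Collect the scenario 1 settings under the ticket's parameter names.

    Hypothetical helper: how these names map onto real Gridmix options
    is not shown here.
    """
    return {
        "GRIDMIX_JOB_TYPE": "LOADJOB",
        "GRIDMIX_USER_RESOLVER": "SubmitterUserResolver",
        "GRIDMIX_SUBMISSION_POLICY": "STRESS",
        "GRIDMIX_JOB_SUBMISSION_QUEUE_IN_TRACE": True,
        "INPUT_SIZE_BYTES": input_size_bytes(250, num_nodes),
        "MINIMUM_FILE_SIZE_BYTES": 150 * MB,
        "TRACE_FILE": trace_file,
    }
```

For a 4-node cluster this yields 1000 MB of input in total, with each generated file expected to be at least 150 MB.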

      2. Generate input data based on cluster size, create synthetic jobs from the 3 min folded MR streaming jobs trace, and submit the jobs with the following arguments:
      GRIDMIX_JOB_TYPE = LoadJob
      GRIDMIX_USER_RESOLVER = RoundRobinUserResolver
      GRIDMIX_BYTES_PER_FILE = 150 MB
      GRIDMIX_SUBMISSION_POLICY = REPLAY
      GRIDMIX_JOB_SUBMISSION_QUEUE_IN_TRACE = True
      Input Size = 200 MB * No. of nodes in cluster.
      PROXY_USERS = proxy users file path
      TRACE_FILE = 3 min folded trace.
      Verify the JobStatus of each job, the input split size of each job, and the summary (QueueName, UserName, StartTime, FinishTime, maps, reducers, counters, etc.) after execution completes.
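
When checking UserName in scenario 2, a test needs to know which proxy user each trace user should have been mapped to. The sketch below assumes that a round-robin resolver hands each distinct trace user the next proxy user in turn, wrapping around when the list is exhausted. That ordering is an assumption for illustration and should be checked against the actual RoundRobinUserResolver behaviour; the function name is hypothetical.

```python
def round_robin_users(trace_users, proxy_users):
    """Map each distinct trace user to a proxy user in round-robin order.

    Assumption: users are assigned in order of first appearance in the
    trace, cycling through proxy_users.
    """
    if not proxy_users:
        raise ValueError("at least one proxy user is required")
    mapping = {}
    for user in trace_users:
        if user not in mapping:
            # The number of users mapped so far selects the next proxy user.
            mapping[user] = proxy_users[len(mapping) % len(proxy_users)]
    return mapping
```

With trace users a, b, a, c and proxy users p1, p2, this maps a to p1, b to p2, and c back to p1.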

      3. Generate input data based on cluster size, create synthetic jobs from the 5 min folded MR streaming jobs trace, and submit the jobs with the following arguments:
      GRIDMIX_JOB_TYPE = LoadJob
      GRIDMIX_USER_RESOLVER = SubmitterUserResolver
      GRIDMIX_SUBMISSION_POLICY = SERIAL
      GRIDMIX_JOB_SUBMISSION_QUEUE_IN_TRACE = false
      GRIDMIX_KEY_FRC = 0.5f
      Input Size = 200 MB * No. of nodes in cluster.
      TRACE_FILE = 5 min folded trace.
      Verify the JobStatus of each job and the summary (QueueName, UserName, StartTime, FinishTime, maps, reducers, counters, etc.) after execution completes.
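
The summary verification step could look roughly like the sketch below. It is not taken from the attached patch: the field names, the dictionary-based job records, and the "default" queue name are assumptions made for illustration. It also assumes that when GRIDMIX_JOB_SUBMISSION_QUEUE_IN_TRACE is false, as in scenario 3, jobs are expected in a default queue rather than in the queue recorded in the trace.

```python
def expected_queue(trace_queue, use_queue_in_trace, default_queue="default"):
    """Queue a job should run in, given the queue-in-trace setting."""
    return trace_queue if use_queue_in_trace else default_queue


def summary_mismatches(expected, actual):
    """Return human-readable differences between two job summaries.

    Both arguments are plain dicts (hypothetical layout) holding
    queue, user, maps, reducers; actual also holds start/finish times.
    """
    problems = []
    for field in ("queue", "user", "maps", "reducers"):
        if expected[field] != actual[field]:
            problems.append(
                f"{field}: expected {expected[field]!r}, got {actual[field]!r}"
            )
    # A job cannot finish before it starts.
    if actual["start_time"] > actual["finish_time"]:
        problems.append("start_time is after finish_time")
    return problems
```

A test would build the expected record from the trace, call `expected_queue` to account for the queue setting, and fail if `summary_mismatches` returns a non-empty list.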

      Attachments

        1. MAPREDUCE-2192.patch
          139 kB
          Vinay Kumar Thota
        2. MAPREDUCE-2192.patch
          146 kB
          Vinay Kumar Thota


          People

            Assignee: Vinay Kumar Thota (vinaythota)
            Reporter: Vinay Kumar Thota (vinaythota)