Hadoop Map/Reduce / MAPREDUCE-3788

[Gridmix] Investigate if Gridmix can be made YARN aware

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: None
    • Component/s: contrib/gridmix
    • Labels: gridmix, yarn

    Description

      Gridmix was written with the monolithic JobTracker in mind. Calls to the single JobTracker were throttled to avoid excess load. Polling was also faster with the JobTracker, because job statuses were cached even after a job had completed. In the YARN world, the situation is slightly different. To make Gridmix scalable and a true YARN scale-benchmarking tool, Gridmix should be enhanced. Some directions worth investigating are:
      1. Investigate whether Gridmix can cache the AM handles and poll each AM directly for map/reduce task progress.
      2. Can the job monitor be made multi-threaded? Each thread could poll a batch of AMs.
      3. Check whether there are better ways to get job progress updates and do away with the busy-waiting logic in Gridmix.
      4. Can Gridmix be made container-aware? The definition of cluster load should be container-aware.
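
      As a rough illustration of directions 1 to 3, the sketch below polls a set of cached per-job handles from a small thread pool and re-schedules each poll instead of busy-waiting. It is only a sketch under stated assumptions: ProgressSource, MultiThreadedJobMonitor and FakeJob are hypothetical names introduced here, not Gridmix or YARN classes, and a real handle would have to wrap whatever client Gridmix uses to reach the AM.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a cached per-job AM handle; not a Hadoop API. */
interface ProgressSource {
  String jobId();
  float progress();       // fraction complete, 0.0 to 1.0
  boolean isComplete();
}

/** Sketch: a fixed pool of threads polls many handles on a schedule. Single use. */
class MultiThreadedJobMonitor {
  private final ScheduledExecutorService pool;
  private final long pollIntervalMs;
  private final Map<String, Float> lastProgress = new ConcurrentHashMap<>();

  MultiThreadedJobMonitor(int threads, long pollIntervalMs) {
    this.pool = Executors.newScheduledThreadPool(threads);
    this.pollIntervalMs = pollIntervalMs;
  }

  /** Blocks until every handle reports completion or the timeout expires. */
  boolean monitor(List<ProgressSource> handles, long timeoutMs) {
    CountDownLatch done = new CountDownLatch(handles.size());
    for (ProgressSource handle : handles) {
      schedulePoll(handle, done, 0);
    }
    try {
      return done.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      pool.shutdownNow();   // the pool is not reused after one monitor() call
    }
  }

  /** Polls once; if the job is still running, schedules the next poll. */
  private void schedulePoll(ProgressSource handle, CountDownLatch done, long delayMs) {
    pool.schedule(() -> {
      lastProgress.put(handle.jobId(), handle.progress());
      if (handle.isComplete()) {
        done.countDown();
      } else {
        schedulePoll(handle, done, pollIntervalMs);
      }
    }, delayMs, TimeUnit.MILLISECONDS);
  }

  /** Copy of the most recently observed progress per job. */
  Map<String, Float> snapshot() {
    return new HashMap<>(lastProgress);
  }
}

/** Demo-only fake: reports completion after a fixed number of polls (must be > 0). */
class FakeJob implements ProgressSource {
  private final String id;
  private final int pollsNeeded;
  private final AtomicInteger polls = new AtomicInteger();

  FakeJob(String id, int pollsNeeded) {
    this.id = id;
    this.pollsNeeded = pollsNeeded;
  }

  public String jobId() { return id; }

  public float progress() {
    return Math.min(1.0f, polls.incrementAndGet() / (float) pollsNeeded);
  }

  public boolean isComplete() { return polls.get() >= pollsNeeded; }
}
```

      For example, new MultiThreadedJobMonitor(2, 5).monitor(jobs, 5000) would watch a list of FakeJob instances with two threads and a 5 ms poll interval. Because each poll is re-scheduled rather than looped, the pool threads stay free between polls and the calling thread waits on a latch instead of spinning. Whether the real AM can be polled this way is exactly what the issue proposes to investigate.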

          People

            Assignee: Unassigned
            Reporter: Amar Kamat (amar_kamat)