Hadoop Map/Reduce: MAPREDUCE-329

eliminate parameters that must change with cluster size


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      As far as possible, configurations should be independent of cluster size. The cluster size is a parameter the system already knows, so it should be used to size things accordingly. Currently, the following parameters must be adjusted to non-default values on large clusters:

      • mapred.job.tracker.handler.count
      • mapred.reduce.parallel.copies
      • tasktracker.http.threads

      We should make each of these either proportional to the cluster size or (harder) dynamically sized based on load.
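
      The proportional-to-cluster-size approach could be sketched as follows. This is a hypothetical illustration, not Hadoop's actual configuration logic: the class name, scaling ratios, and floor values are all assumptions chosen so that small clusters keep today's defaults while large clusters scale up automatically.

      ```java
      // Hypothetical sketch: derive the three cluster-size-sensitive defaults
      // from the number of nodes instead of hard-coding them. Ratios and
      // floors below are illustrative assumptions, not Hadoop's real values.
      public class ClusterScaledDefaults {
          // Assumed floors, matching small-cluster defaults.
          static final int MIN_HANDLER_COUNT = 10;
          static final int MIN_PARALLEL_COPIES = 5;
          static final int MIN_HTTP_THREADS = 40;

          // mapred.job.tracker.handler.count:
          // roughly one RPC handler per 50 TaskTrackers, never below the floor.
          static int handlerCount(int numNodes) {
              return Math.max(MIN_HANDLER_COUNT, numNodes / 50);
          }

          // mapred.reduce.parallel.copies:
          // grow reduce-side fetch parallelism with the square root of cluster size.
          static int parallelCopies(int numNodes) {
              return Math.max(MIN_PARALLEL_COPIES, (int) Math.sqrt(numNodes));
          }

          // tasktracker.http.threads:
          // roughly one shuffle-serving thread per 25 nodes.
          static int httpThreads(int numNodes) {
              return Math.max(MIN_HTTP_THREADS, numNodes / 25);
          }

          public static void main(String[] args) {
              for (int n : new int[] {20, 500, 4000}) {
                  System.out.printf("%d nodes: handlers=%d copies=%d httpThreads=%d%n",
                          n, handlerCount(n), parallelCopies(n), httpThreads(n));
              }
          }
      }
      ```

      The harder alternative, dynamic sizing based on load, would replace these static formulas with feedback from observed queue lengths or thread utilization at runtime.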

      People

          Assignee: Unassigned
          Reporter: Doug Cutting (cutting)
          Votes: 0
          Watchers: 3
