Hadoop Map/Reduce / MAPREDUCE-3561 [Umbrella ticket] Performance issues in YARN+MR / MAPREDUCE-3812

Lower default allocation sizes, fix allocation configurations and document them


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.3, 2.0.2-alpha
    • Component/s: mrv2, performance
    • Labels: None
    • Hadoop Flags: Incompatible change
    • Release Note:
      Removes two sets of previously available config properties:

      1. yarn.scheduler.fifo.minimum-allocation-mb and yarn.scheduler.fifo.maximum-allocation-mb, and
      2. yarn.scheduler.capacity.minimum-allocation-mb and yarn.scheduler.capacity.maximum-allocation-mb,

      in favor of two new, generically named properties:

      1. yarn.scheduler.minimum-allocation-mb - This acts as the floor value of memory resource requests for containers.
      2. yarn.scheduler.maximum-allocation-mb - This acts as the ceiling value of memory resource requests for containers.

      Both of these properties need to be set on the ResourceManager (RM) to take effect, as the RM is where the scheduler resides.

      Also changes the default minimum and maximum to 128 MB and 10 GB, respectively.
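
      For illustration, a minimal sketch of how the two new properties would look in yarn-site.xml on the ResourceManager, using the new defaults from this change:

          <!-- yarn-site.xml on the ResourceManager -->
          <property>
            <!-- Floor for container memory requests; smaller asks are rounded up to this. -->
            <name>yarn.scheduler.minimum-allocation-mb</name>
            <value>128</value>
          </property>
          <property>
            <!-- Ceiling for container memory requests; 10 GB expressed in MB. -->
            <name>yarn.scheduler.maximum-allocation-mb</name>
            <value>10240</value>
          </property>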

    Description

      After a few performance improvements tracked at MAPREDUCE-3561, like MAPREDUCE-3511 and MAPREDUCE-3567, even a 100K-map job can run within 1 GB of vmem. We earlier increased the AM slot size from one slot to two to work around issues with the AM heap. Now that those are fixed, we should go back to 1 GB.

      This is just a configuration change.

      P.S.:

      • Currently, the min/max allocation is set via per-scheduler configs, which makes no sense as there's no way to run multiple schedulers anyway. Switch the configs to a single, generic RM config.
      • The min/max allocation configs aren't documented, and we ought to document them (see MAPREDUCE-4027).
      • 1 GB is perhaps too high for a slot's minimum. While job defaults can be left at such values, we should lower the minimum allocation to 128 MB to allow low-memory requests out of the box. This shouldn't impact the MR app in any way (a sketch of such a request follows this list).
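
      As a sketch of such a low-memory request, a job could turn the per-task memory knobs (assuming the standard MRv2 properties mapreduce.map.memory.mb / mapreduce.reduce.memory.mb, which are not part of this patch) down to the new floor:

          <!-- Per-job configuration: request 128 MB map containers. -->
          <property>
            <name>mapreduce.map.memory.mb</name>
            <value>128</value>
          </property>

      The scheduler normalizes each request up to a multiple of yarn.scheduler.minimum-allocation-mb, so with a 128 MB floor a 200 MB ask becomes a 256 MB container, whereas the old 1 GB floor would have inflated it to a full 1 GB.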

      Attachments

        1. MAPREDUCE-3812.patch (21 kB) - Harsh J
        2. MAPREDUCE-3812.patch (20 kB) - Harsh J
        3. MAPREDUCE-3812.patch (21 kB) - Harsh J
        4. MAPREDUCE-3812.patch (25 kB) - Arun Murthy
        5. MAPREDUCE-3812.patch (21 kB) - Arun Murthy
        6. MAPREDUCE-3812-20120205.txt (10 kB) - Vinod Kumar Vavilapalli
        7. MAPREDUCE-3812-20120206.1.txt (19 kB) - Vinod Kumar Vavilapalli
        8. MAPREDUCE-3812-20120206.txt (10 kB) - Vinod Kumar Vavilapalli


People

    Assignee: qwertymaniac (Harsh J)
    Reporter: vinodkv (Vinod Kumar Vavilapalli)
    Votes: 0
    Watchers: 4

Dates

    Created:
    Updated:
    Resolved: