Hadoop Map/Reduce › MAPREDUCE-5785

Derive heap size or mapreduce.*.memory.mb automatically

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0-alpha1
    • Component/s: mr-am, task
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Incompatible change
    • Release Note:
      The memory values for the mapreduce.map/reduce.memory.mb keys, if left at their default value of -1, will now be automatically inferred from the heap size (-Xmx) specified in the corresponding mapreduce.map/reduce.java.opts keys.

      The converse is also done: if mapreduce.map/reduce.memory.mb values are specified but no -Xmx is supplied in the mapreduce.map/reduce.java.opts keys, then the -Xmx value is derived from the memory.mb value.

      If neither is specified, a default value of 1024 MB is used.

      For both conversions, a scaling factor given by the property mapreduce.job.heap.memory-mb.ratio is applied, to account for the overhead between heap usage and actual physical memory usage.

      Existing configs or job code that already specify both properties explicitly are not affected by this inference.
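The two-way inference described in the release note can be sketched roughly as follows. This is a hypothetical illustration, not the actual MapReduce code: the class and method names are invented, and the exact rounding in the real patch may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the memory.mb <-> -Xmx inference; names are invented.
public class HeapSizing {
    static final double DEFAULT_HEAP_RATIO = 0.8; // mapreduce.job.heap.memory-mb.ratio
    static final long DEFAULT_MEMORY_MB = 1024;   // used when neither value is set

    /** Extract the -Xmx value in MB from a java.opts string, or -1 if absent. */
    static long parseXmxMb(String javaOpts) {
        if (javaOpts == null) return -1;
        Matcher m = Pattern.compile("-Xmx(\\d+)([gGmMkK]?)").matcher(javaOpts);
        long xmx = -1;
        while (m.find()) { // last -Xmx wins, as the JVM itself behaves
            long v = Long.parseLong(m.group(1));
            char unit = m.group(2).isEmpty() ? 'b'
                : Character.toLowerCase(m.group(2).charAt(0));
            switch (unit) {
                case 'g': xmx = v * 1024; break;
                case 'm': xmx = v; break;
                case 'k': xmx = v / 1024; break;
                default:  xmx = v / (1024 * 1024); break; // bare bytes
            }
        }
        return xmx;
    }

    /** Container size: taken verbatim if set, else scaled up from -Xmx. */
    static long containerMb(long memoryMb, String javaOpts, double ratio) {
        if (memoryMb != -1) return memoryMb;       // explicit user setting wins
        long xmx = parseXmxMb(javaOpts);
        if (xmx == -1) return DEFAULT_MEMORY_MB;   // neither value specified
        return (long) Math.ceil(xmx / ratio);      // heap -> physical memory
    }

    /** Heap size: taken from -Xmx if present, else scaled down from memory.mb. */
    static long heapMb(long memoryMb, String javaOpts, double ratio) {
        long xmx = parseXmxMb(javaOpts);
        if (xmx != -1) return xmx;                 // explicit user setting wins
        long mb = (memoryMb != -1) ? memoryMb : DEFAULT_MEMORY_MB;
        return (long) (mb * ratio);                // physical memory -> heap
    }
}
```

With a ratio of 0.8, for example, a job that only sets `-Xmx800m` would be placed in a 1000 MB container, and a job that only sets `memory.mb=2048` would get a 1638 MB heap.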

      Description

      Currently users have to set two memory-related configs per job, per task type. One first chooses a container size mapreduce.*.memory.mb and then a corresponding maximum Java heap size -Xmx smaller than mapreduce.*.memory.mb. This ensures that the JVM's total memory (native memory + Java heap) does not exceed mapreduce.*.memory.mb. If one forgets to tune -Xmx, the MR AM might be

      • allocating big containers while the JVM only uses the default -Xmx200m, wasting memory, or
      • allocating small containers that OOM because -Xmx is too high.

      With this JIRA, we propose to set -Xmx automatically based on an empirical ratio that can be adjusted. -Xmx is not changed automatically if the user provides it.
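As an illustration of the proposal, a job config that sets only the container size would have its task heap derived via the ratio. The values below are hypothetical, and the derived -Xmx shown in the comment assumes straightforward multiplication by the ratio:

```xml
<!-- Hypothetical job config: only the container size is set explicitly. -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapreduce.job.heap.memory-mb.ratio</name>
  <value>0.8</value> <!-- scaling factor between heap and container size -->
</property>
<!-- No -Xmx in mapreduce.map.java.opts, so the map task JVM would get
     roughly -Xmx1638m (2048 MB * 0.8). -->
```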

        Attachments

        1. MAPREDUCE-5785.v01.patch
          10 kB
          Gera Shegalov
        2. MAPREDUCE-5785.v02.patch
          26 kB
          Gera Shegalov
        3. MAPREDUCE-5785.v03.patch
          25 kB
          Gera Shegalov
        4. mr-5785-4.patch
          25 kB
          Karthik Kambatla
        5. mr-5785-5.patch
          24 kB
          Karthik Kambatla
        6. mr-5785-6.patch
          24 kB
          Karthik Kambatla
        7. mr-5785-7.patch
          23 kB
          Karthik Kambatla
        8. mr-5785-8.patch
          23 kB
          Karthik Kambatla
        9. mr-5785-9.patch
          23 kB
          Karthik Kambatla

          Issue Links

            Activity

              People

              • Assignee:
                Gera Shegalov
              • Reporter:
                Gera Shegalov
              • Votes:
                0
              • Watchers:
                24

                Dates

                • Created:
                  Updated:
                  Resolved: