
MAPREDUCE-6551: Dynamically adjust mapTaskAttempt memory size


Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: task
    • Labels: None

    Description

      I found a scenario where map tasks consume a lot of cluster resources. This happens when there are many small file blocks (some not even reaching 1 MB), which leads to many map tasks being launched to read them. In general, a map task attempt uses the default config MRJobConfig#MAP_MEMORY_MB to set the memory of its resource capacity for processing its data. This causes a problem: map tasks consume a lot of memory even though the target data is small. So I have an idea: can we dynamically set a mapTaskAttempt's memory size based on its input data length? This value can be provided by the TaskSplitMetaInfo#getInputDataLength method. Besides that, we should provide a standard unit of data length corresponding to a standard memory size.
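
      A minimal sketch of the idea (not the attached patch): scale the memory request between a floor and the configured MRJobConfig#MAP_MEMORY_MB, using the split length returned by TaskSplitMetaInfo#getInputDataLength(). The class name and the constants STANDARD_SPLIT_BYTES and MIN_MAP_MEMORY_MB below are illustrative assumptions, not values taken from the patch.

      public final class MapMemoryScaler {

        // Hypothetical "standard unit": a split of this size (or larger) gets the full configured memory.
        private static final long STANDARD_SPLIT_BYTES = 128L * 1024 * 1024; // 128 MB
        // Hypothetical floor so very small splits still get a usable container.
        private static final int MIN_MAP_MEMORY_MB = 256;

        private MapMemoryScaler() {
        }

        // inputDataLength: value from TaskSplitMetaInfo#getInputDataLength()
        // configuredMapMemoryMb: value of MRJobConfig#MAP_MEMORY_MB
        // returns the memory (in MB) to request for this map task attempt
        public static int scaledMapMemoryMb(long inputDataLength, int configuredMapMemoryMb) {
          if (inputDataLength <= 0 || inputDataLength >= STANDARD_SPLIT_BYTES) {
            // Unknown length or a full-size split: keep the configured default.
            return configuredMapMemoryMb;
          }
          double ratio = (double) inputDataLength / STANDARD_SPLIT_BYTES;
          int scaled = (int) Math.ceil(configuredMapMemoryMb * ratio);
          return Math.max(MIN_MAP_MEMORY_MB, scaled);
        }
      }

      For example, with mapreduce.map.memory.mb = 1024 and a 1 MB split, this would request max(256, ceil(1024 * 1/128)) = 256 MB instead of the full 1024 MB.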

      Attachments

        1. MAPREDUCE-6551.001.patch
          8 kB
          Yiqun Lin


          People

            Assignee: linyiqun (Yiqun Lin)
            Reporter: linyiqun (Yiqun Lin)
            Votes: 0
            Watchers: 1
