Hadoop Map/Reduce / MAPREDUCE-5947

Map phase merge can better utilize memory

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: performance, task

    Description

      The map-phase merge reads spills from disk, merges them, and writes the intermediate results back to disk, repeating this for each merge pass. I think memory could hold these intermediate results instead, reducing disk I/O. Because kvbuffer is set to null right before the merge, at least io.sort.mb of heap is available at that point.
      MAPREDUCE-4511 can be seen as an effort to use memory better through read-ahead, but it leaves the number of disk I/O operations unchanged.
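
To make the idea concrete, here is a minimal, self-contained sketch of a multi-pass merge that keeps an intermediate result in memory when it fits within a budget and counts it as a disk write otherwise. It is not Hadoop's Merger code: the class InMemoryMergeSketch, its methods, and the use of record counts as a stand-in for the io.sort.mb byte budget are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration only; none of these names exist in Hadoop. */
final class InMemoryMergeSketch {

    /** A sorted run and whether it currently lives in memory (spills start on disk). */
    static final class Run {
        final int[] records;
        final boolean inMemory;
        Run(int[] records, boolean inMemory) { this.records = records; this.inMemory = inMemory; }
    }

    /** Final merged output plus the number of records intermediate passes wrote to disk. */
    static final class Result {
        final int[] merged;
        final long intermediateRecordsWritten;
        Result(int[] merged, long written) { this.merged = merged; this.intermediateRecordsWritten = written; }
    }

    /**
     * Merges sorted spills, at most `factor` runs per pass, smallest runs first.
     * memBudget is the assumed free heap, measured in records for simplicity.
     */
    static Result merge(List<int[]> spills, int factor, long memBudget) {
        if (factor < 2) throw new IllegalArgumentException("factor must be >= 2");
        PriorityQueue<Run> queue =
            new PriorityQueue<>(Comparator.comparingInt((Run r) -> r.records.length));
        for (int[] s : spills) queue.add(new Run(s, false));
        long memUsed = 0;
        long diskWrites = 0;
        while (queue.size() > factor) {            // intermediate passes
            List<Run> batch = new ArrayList<>();
            for (int i = 0; i < factor; i++) batch.add(queue.poll());
            int[] out = mergeRuns(batch);
            // Conservative check: inputs still occupy memory while the output is built.
            boolean fits = memUsed + out.length <= memBudget;
            if (fits) memUsed += out.length; else diskWrites += out.length;
            for (Run r : batch) if (r.inMemory) memUsed -= r.records.length;
            queue.add(new Run(out, fits));
        }
        // The final pass produces the map output, which goes to disk either way.
        return new Result(mergeRuns(new ArrayList<>(queue)), diskWrites);
    }

    /** Standard k-way merge; heap entries are {runIndex, position}. */
    static int[] mergeRuns(List<Run> runs) {
        int total = 0;
        for (Run r : runs) total += r.records.length;
        int[] out = new int[total];
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            Comparator.comparingInt((int[] c) -> runs.get(c[0]).records[c[1]]));
        for (int i = 0; i < runs.size(); i++)
            if (runs.get(i).records.length > 0) heap.add(new int[] {i, 0});
        int n = 0;
        while (!heap.isEmpty()) {
            int[] c = heap.poll();
            int[] rec = runs.get(c[0]).records;
            out[n++] = rec[c[1]];
            if (c[1] + 1 < rec.length) heap.add(new int[] {c[0], c[1] + 1});
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> spills = Arrays.asList(
            new int[] {1, 4}, new int[] {2, 5}, new int[] {3, 6}, new int[] {0, 7});
        System.out.println("no budget:   " + merge(spills, 2, 0).intermediateRecordsWritten);
        System.out.println("with budget: " + merge(spills, 2, 8).intermediateRecordsWritten);
    }
}
```

In this toy model the merged output is identical with or without a budget; only the number of records written by intermediate passes changes. A real implementation would need to account for bytes rather than records and for the other buffers held during the merge.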

      Please give me your thoughts. I'd like to take up this issue.

          People

            Assignee: jaehoon ko (jaehoon13.ko)
            Reporter: jaehoon ko (jaehoon13.ko)