Hadoop Map/Reduce / MAPREDUCE-64

Map-side sort is hampered by io.sort.record.percent

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: performance, task
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Currently io.sort.record.percent is a fairly obscure, per-job configurable, expert-level parameter that controls how much of the map-side sort buffer (io.sort.mb) is reserved for record accounting. Typical values for io.sort.mb (100) and io.sort.record.percent (0.05) imply that we can store ~350,000 records in the buffer before a sort/combine/spill becomes necessary.
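
      The arithmetic behind that estimate can be sketched as follows. This is an illustration only, not the framework's code: the class and method names are hypothetical, and the cost of 16 bytes of accounting per record is an assumption that does not come from this issue. The 16-byte average record size in main is likewise made up. The sketch also ignores any other condition that might trigger a spill earlier.

```java
// Rough model of a sort buffer split into an accounting region and a data region.
// ASSUMPTION: each buffered record costs 16 bytes of accounting space.
class SortBufferEstimate {
    static final int ACCOUNTING_BYTES_PER_RECORD = 16; // assumed, not from the issue

    /** Number of records the accounting region can track before it is full. */
    static int recordCapacity(int ioSortMb, double recordPercent) {
        int bufferBytes = ioSortMb << 20; // MB to bytes
        int accountingBytes = (int) (bufferBytes * recordPercent);
        // Round down to a whole number of accounting entries.
        accountingBytes -= accountingBytes % ACCOUNTING_BYTES_PER_RECORD;
        return accountingBytes / ACCOUNTING_BYTES_PER_RECORD;
    }

    /** Serialized bytes buffered when the first of the two regions fills up. */
    static long dataBytesAtSpill(int ioSortMb, double recordPercent, int avgRecordBytes) {
        int records = recordCapacity(ioSortMb, recordPercent);
        long dataRegionBytes =
            ((long) ioSortMb << 20) - (long) records * ACCOUNTING_BYTES_PER_RECORD;
        // Small records exhaust accounting first; large records exhaust data first.
        return Math.min((long) records * avgRecordBytes, dataRegionBytes);
    }

    public static void main(String[] args) {
        // io.sort.mb = 100, io.sort.record.percent = 0.05, hypothetical 16-byte records
        System.out.println("records before spill: " + recordCapacity(100, 0.05));
        System.out.println("data bytes buffered:  " + dataBytesAtSpill(100, 0.05, 16));
    }
}
```

      Under these assumptions the accounting region holds 327,680 records, in line with the rough figure quoted above, and with 16-byte records only about 5 MB of the data region is in use when it fills.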

      However, for many applications that deal with small records, e.g. the world-famous wordcount and its family, this implies we can use only 5-10% of io.sort.mb (i.e. 5-10M) before we spill, in spite of having much more memory available in the sort buffer. Wordcount, for example, results in ~12 spills (given an HDFS block size of 64M). The presence of a combiner exacerbates the problem by piling on serialization/deserialization of records too...

      Sure, jobs can configure io.sort.record.percent, but it's tedious and obscure; we really can do better by getting the framework to pick the split automagically, using all available memory (up to io.sort.mb) for either data or accounting.
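
      To illustrate why hand-tuning is tedious: the fraction at which the accounting and data regions fill at the same time depends on the job's average serialized record size, which the user has to measure first. The sketch below is a hypothetical helper, not part of Hadoop, and it assumes 16 bytes of accounting per record (an assumption, not a figure from this issue).

```java
// Hypothetical helper: pick a record fraction so that accounting space and
// data space run out together for a given average record size.
// ASSUMPTION: each buffered record costs 16 bytes of accounting space.
class RecordPercentTuning {
    static final int ACCOUNTING_BYTES_PER_RECORD = 16; // assumed, not from the issue

    /**
     * Each record consumes 16 accounting bytes plus avgRecordBytes data bytes,
     * so the accounting share of the buffer is 16 / (16 + avgRecordBytes).
     */
    static double balancedRecordPercent(double avgRecordBytes) {
        if (avgRecordBytes <= 0) {
            throw new IllegalArgumentException("avgRecordBytes must be positive");
        }
        return ACCOUNTING_BYTES_PER_RECORD / (ACCOUNTING_BYTES_PER_RECORD + avgRecordBytes);
    }

    public static void main(String[] args) {
        // Hypothetical record sizes: tiny wordcount-style pairs vs. larger records.
        System.out.println("16-byte records:  " + balancedRecordPercent(16));
        System.out.println("304-byte records: " + balancedRecordPercent(304));
    }
}
```

      Under this model the 0.05 setting only suits records of roughly 300 bytes, while 16-byte records would want about half the buffer for accounting. Having the framework adapt the split removes the need to measure and configure this per job.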

      1. M64-0.patch
        80 kB
        Chris Douglas
      2. M64-0i.png
        30 kB
        Chris Douglas
      3. M64-1.patch
        86 kB
        Chris Douglas
      4. M64-10.patch
        114 kB
        Chris Douglas
      5. M64-1i.png
        32 kB
        Chris Douglas
      6. M64-2.patch
        89 kB
        Chris Douglas
      7. M64-2i.png
        29 kB
        Chris Douglas
      8. M64-3.patch
        93 kB
        Chris Douglas
      9. M64-4.patch
        106 kB
        Chris Douglas
      10. M64-5.patch
        108 kB
        Chris Douglas
      11. M64-6.patch
        108 kB
        Chris Douglas
      12. M64-7.patch
        107 kB
        Chris Douglas
      13. M64-8.patch
        108 kB
        Chris Douglas
      14. M64-9.patch
        106 kB
        Chris Douglas

        Activity

        Arun C Murthy created issue -
        Owen O'Malley made changes -
        Field Original Value New Value
        Project Hadoop Common [ 12310240 ] Hadoop Map/Reduce [ 12310941 ]
        Key HADOOP-5108 MAPREDUCE-64
        Affects Version/s 0.20.0 [ 12313438 ]
        Component/s mapred [ 12310690 ]
        Chris Douglas made changes -
        Attachment M64-0.patch [ 12421173 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Assignee Chris Douglas [ chris.douglas ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Chris Douglas made changes -
        Attachment M64-1.patch [ 12421799 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Douglas made changes -
        Attachment M64-2.patch [ 12422203 ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Chris Douglas made changes -
        Attachment M64-3.patch [ 12422471 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Douglas made changes -
        Attachment M64-2i.png [ 12422473 ]
        Attachment M64-1i.png [ 12422474 ]
        Attachment M64-0i.png [ 12422475 ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Chris Douglas made changes -
        Attachment M64-4.patch [ 12427438 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Douglas made changes -
        Attachment M64-5.patch [ 12428718 ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Chris Douglas made changes -
        Attachment M64-6.patch [ 12429872 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Douglas made changes -
        Attachment M64-7.patch [ 12429943 ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Douglas made changes -
        Attachment M64-8.patch [ 12431189 ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Douglas made changes -
        Attachment M64-9.patch [ 12431283 ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Douglas made changes -
        Attachment M64-10.patch [ 12434447 ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Fix Version/s 0.22.0 [ 12314184 ]
        Resolution Fixed [ 1 ]
        Tom White made changes -
        Fix Version/s 0.21.0 [ 12314045 ]
        Fix Version/s 0.22.0 [ 12314184 ]
        Tom White made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Todd Lipcon made changes -
        Component/s performance [ 12316500 ]
        Component/s task [ 12312920 ]

          People

          • Assignee: Chris Douglas
          • Reporter: Arun C Murthy
          • Votes: 0
          • Watchers: 25
