Hadoop Map/Reduce
MAPREDUCE-4502

Node-level aggregation with combining the result of maps

    Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: applicationmaster, mrv2
    • Labels:

      Description

      Shuffle is expensive in Hadoop despite the existence of combiners, because the scope of combining is limited to a single MapTask. One way to address this is to aggregate the map results per node/rack by launching a combiner over them.

      This JIRA is to implement the multi-level aggregation infrastructure, including combining per container (MAPREDUCE-3902 is related) and coordinating containers through the application master, without breaking the fault tolerance of jobs.
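
      To make the idea concrete, here is a minimal, self-contained sketch of the difference between per-task combining and node-level combining, using word count as the workload. It does not use the Hadoop API and is not taken from the attached patches; the class name NodeLevelAggregationSketch and all of its methods are hypothetical, chosen only for illustration. It also assumes the combine function is associative and commutative, so that applying it a second time across tasks gives the same final result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: not the MAPREDUCE-4502 implementation.
public class NodeLevelAggregationSketch {

    /** Emits (word, 1) for each input word, as a word-count map task would. */
    static List<Map.Entry<String, Integer>> map(String... words) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String w : words) {
            out.add(Map.entry(w, 1));
        }
        return out;
    }

    /** Combiner logic for word count: sums the counts of each key. */
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> records) {
        Map<String, Integer> combined = new TreeMap<>();
        for (Map.Entry<String, Integer> r : records) {
            combined.merge(r.getKey(), r.getValue(), Integer::sum);
        }
        return combined;
    }

    /** Records sent to shuffle when each map task combines only its own output. */
    static int perTaskShuffleSize(List<List<Map.Entry<String, Integer>>> tasksOnNode) {
        int records = 0;
        for (List<Map.Entry<String, Integer>> task : tasksOnNode) {
            records += combine(task).size();
        }
        return records;
    }

    /** Combines each task's output, then combines those results across the node. */
    static Map<String, Integer> nodeLevelCombine(List<List<Map.Entry<String, Integer>>> tasksOnNode) {
        List<Map.Entry<String, Integer>> merged = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> task : tasksOnNode) {
            merged.addAll(combine(task).entrySet());
        }
        return combine(merged);
    }

    public static void main(String[] args) {
        // Three map tasks that happened to run on the same node.
        List<List<Map.Entry<String, Integer>>> node = List.of(
                map("a", "b", "a"),
                map("a", "c"),
                map("b", "b", "c"));

        Map<String, Integer> aggregated = nodeLevelCombine(node);
        System.out.println("per-task combining shuffles " + perTaskShuffleSize(node) + " records");
        System.out.println("node-level combining shuffles " + aggregated.size() + " records: " + aggregated);
    }
}
```

      In this toy input, per-task combining leaves 6 records to shuffle, while node-level combining leaves 3, because keys repeated across tasks on the same node are merged before they leave the node. The saving in a real job depends on how often keys repeat across co-located map tasks.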

      1. MAPREDUCE-4502.10.patch
        109 kB
        Tsuyoshi Ozawa
      2. MAPREDUCE-4502.9.patch
        109 kB
        Tsuyoshi Ozawa
      3. MAPREDUCE-4502.9.patch
        109 kB
        Tsuyoshi Ozawa
      4. MAPREDUCE-4502.8.patch
        117 kB
        Tsuyoshi Ozawa
      5. MAPREDUCE-4502.8.patch
        117 kB
        Tsuyoshi Ozawa
      6. MAPREDUCE-4502.7.patch
        117 kB
        Tsuyoshi Ozawa
      7. design_v3.pdf
        440 kB
        Tsuyoshi Ozawa
      8. MAPREDUCE-4502.6.patch
        116 kB
        Tsuyoshi Ozawa
      9. MAPREDUCE-4502.5.patch
        115 kB
        Tsuyoshi Ozawa
      10. MAPREDUCE-4502.4.patch
        115 kB
        Tsuyoshi Ozawa
      11. MAPREDUCE-4502.3.patch
        115 kB
        Tsuyoshi Ozawa
      12. MAPREDUCE-4502.2.patch
        115 kB
        Tsuyoshi Ozawa
      13. MAPREDUCE-4502.1.patch
        113 kB
        Tsuyoshi Ozawa
      14. MAPREDUCE-4525-pof.diff
        71 kB
        Tsuyoshi Ozawa
      15. design_v2.pdf
        347 kB
        Tsuyoshi Ozawa
      16. speculative_draft.pdf
        142 kB
        Tsuyoshi Ozawa

        Issue Links

          Activity

          Allen Wittenauer made changes -
          Labels BB2015-05-TBR
          Hyunsik Choi made changes -
          Link This issue is related to TAJO-374 [ TAJO-374 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4502.10.patch [ 12592783 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4502.9.patch [ 12592748 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4502.9.patch [ 12592745 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4502.8.patch [ 12581124 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4502.8.patch [ 12581063 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4502.7.patch [ 12580718 ]
          Tsuyoshi Ozawa made changes -
          Summary changed from "Multi-level aggregation with combining the result of maps per node/rack" to "Node-level aggregation with combining the result of maps"
          Tsuyoshi Ozawa made changes -
          Attachment design_v3.pdf [ 12579117 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4502.6.patch [ 12570748 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4502.5.patch [ 12570735 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4502.4.patch [ 12570292 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4502.3.patch [ 12570272 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4502.2.patch [ 12570243 ]
          Tsuyoshi Ozawa made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Affects Version/s 3.0.0 [ 12320355 ]
          Tsuyoshi Ozawa made changes -
          Status In Progress [ 3 ] Open [ 1 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4502.1.patch [ 12570068 ]
          Tsuyoshi Ozawa made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Tsuyoshi Ozawa made changes -
          Attachment MAPREDUCE-4525-pof.diff [ 12559878 ]
          Tsuyoshi Ozawa made changes -
          Attachment design_v2.pdf [ 12546630 ]
          Tsuyoshi Ozawa made changes -
          Attachment speculative_draft.pdf [ 12544415 ]
          Tsuyoshi Ozawa made changes -
          Link This issue is blocked by MAPREDUCE-3902 [ MAPREDUCE-3902 ]
          Tsuyoshi Ozawa made changes -
          Assignee Tsuyoshi OZAWA [ ozawa ]
          Tsuyoshi Ozawa made changes -
          Link This issue is blocked by MAPREDUCE-3902 [ MAPREDUCE-3902 ]
          Tsuyoshi Ozawa made changes -
          Field Original Value New Value
          Description Re-wrapped only; the original and new values carry the same text as the Description section above.
          Tsuyoshi Ozawa created issue -

            People

            • Assignee:
              Tsuyoshi Ozawa
              Reporter:
              Tsuyoshi Ozawa
            • Votes:
              0
              Watchers:
              37

              Dates

              • Created:
                Updated:
