Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-4812

Create reduce input merger plugin in ReduceTask.java and pass it to Shuffle

    Details

    • Type: New Feature New Feature
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Duplicate
    • Affects Version/s: 2.0.2-alpha
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      This is part of MAPREDUCE-2454. This further breaks down MAPREDUCE-4808

      1. COMBO-mapreduce-4809-4812.patch
        36 kB
        Mariappan Asokan
      2. mapreduce-4812.patch
        26 kB
        Mariappan Asokan
      3. COMBO-mapreduce-4809-4812.patch
        56 kB
        Mariappan Asokan
      4. mapreduce-4812.patch
        48 kB
        Mariappan Asokan
      5. mapreduce-4812.patch
        40 kB
        Mariappan Asokan
      6. mapreduce-4812.patch
        40 kB
        Mariappan Asokan
      7. mapreduce-4812.patch
        19 kB
        Mariappan Asokan

        Issue Links

          Activity

          Hide
          Alejandro Abdelnur added a comment -

          Thanks Asokan, this makes sense, it will reduce further the size of MAPREDUCE-4808.

          I'm also marking this as a related to MAPREDUCE-4049 as this change will affect the ShuffleContext object. I think we should commit this one and then MAPREDUCE-4049 on top of it.

          Show
          Alejandro Abdelnur added a comment - Thanks Asokan, this makes sense, it will reduce further the size of MAPREDUCE-4808 . I'm also marking this as a related to MAPREDUCE-4049 as this change will affect the ShuffleContext object. I think we should commit this one and then MAPREDUCE-4049 on top of it.
          Hide
          Mariappan Asokan added a comment -

          Hi Arun & Alejandro,
          I am uploading a patch that implements the bare-bones requirements for the reduce input merger plugin. I will create a simple mock test for the plugin and post it with some protection changes so that the test will run.

          Arun, in this patch an instance of ReduceInputMerger plugin is passed to Shuffle constructor as you asked for. I figured out a way to implement sort avoidance plugin. However, I need MAPREDUCE-4049(shuffle plugin) to be committed before that.

          Thanks.
          – Asokan

          Show
          Mariappan Asokan added a comment - Hi Arun & Alejandro, I am uploading a patch that implements the bare-bones requirements for the reduce input merger plugin. I will create a simple mock test for the plugin and post it with some protection changes so that the test will run. Arun, in this patch an instance of ReduceInputMerger plugin is passed to Shuffle constructor as you asked for. I figured out a way to implement sort avoidance plugin. However, I need MAPREDUCE-4049 (shuffle plugin) to be committed before that. Thanks. – Asokan
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12554427/mapreduce-4812.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3044//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3044//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12554427/mapreduce-4812.patch against trunk revision . +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3044//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3044//console This message is automatically generated.
          Hide
          Alejandro Abdelnur added a comment -

          Asokan, it looks good. Similar to what has been suggested/done in MAPREDUCE-4049 the MergerManager.initialize() method should receive a context with all parameter. This Context could be a static inner class in the ReduceInputMerger interface. Also the initialize() method should be renamed to init() to be consistent with the shuffle plugin.

          Show
          Alejandro Abdelnur added a comment - Asokan, it looks good. Similar to what has been suggested/done in MAPREDUCE-4049 the MergerManager.initialize() method should receive a context with all parameter. This Context could be a static inner class in the ReduceInputMerger interface. Also the initialize() method should be renamed to init() to be consistent with the shuffle plugin.
          Hide
          Mariappan Asokan added a comment -

          Hi Alejandro,
          Do you want me to do similar thing for MapOutputCollector interface(change initialize() to init() and create a Context though there are only three parameters)?
          Thanks.

          – Asokan

          Show
          Mariappan Asokan added a comment - Hi Alejandro, Do you want me to do similar thing for MapOutputCollector interface(change initialize() to init() and create a Context though there are only three parameters)? Thanks. – Asokan
          Hide
          Alejandro Abdelnur added a comment -

          Asokan, that is a good point, it didn't seem obvious there before because the params passed are just 3. Please do. Thx

          Show
          Alejandro Abdelnur added a comment - Asokan, that is a good point, it didn't seem obvious there before because the params passed are just 3. Please do. Thx
          Hide
          Mariappan Asokan added a comment -

          Hi Alejandro,
          I have created static context class and changed the method name from initialize() to init() as you suggested. It will receive only a context object.

          Please review and give your comments.

          Thanks.
          – Asokan

          Show
          Mariappan Asokan added a comment - Hi Alejandro, I have created static context class and changed the method name from initialize() to init() as you suggested. It will receive only a context object. Please review and give your comments. Thanks. – Asokan
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12554592/mapreduce-4812.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3052//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3052//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12554592/mapreduce-4812.patch against trunk revision . +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3052//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3052//console This message is automatically generated.
          Hide
          Alejandro Abdelnur added a comment -

          Analogous comments like in MAPREDUCE-4807.

          It looks good, you indicated you'd be creating a patch with a test. I'll wait for it for +1ing it.

          Minor naming nits:

          • The property mapreduce.job.reduce.merge.classs, I think a better name would be mapreduce.job.reduce.input.merger.classs
          • The ReduceMergerContext inner class, it could simply be called Context (as the outer class fully defines it)
          Show
          Alejandro Abdelnur added a comment - Analogous comments like in MAPREDUCE-4807 . It looks good, you indicated you'd be creating a patch with a test. I'll wait for it for +1ing it. Minor naming nits: The property mapreduce.job.reduce.merge.classs , I think a better name would be mapreduce.job.reduce.input.merger.classs The ReduceMergerContext inner class, it could simply be called Context (as the outer class fully defines it)
          Hide
          Mariappan Asokan added a comment -

          Hi Alejandro,
          Thanks for your comments. I am uploading the patch incorporating the name changes. However, I cannot even create a mock plugin test unless I create a combo of MAPREDUCE-4809 and MAPREDUCE-4812. I will do that and post a patch with a mock plugin test.

          – Asokan

          Show
          Mariappan Asokan added a comment - Hi Alejandro, Thanks for your comments. I am uploading the patch incorporating the name changes. However, I cannot even create a mock plugin test unless I create a combo of MAPREDUCE-4809 and MAPREDUCE-4812 . I will do that and post a patch with a mock plugin test. – Asokan
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12554716/mapreduce-4812.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3062//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3062//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12554716/mapreduce-4812.patch against trunk revision . +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3062//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3062//console This message is automatically generated.
          Hide
          Alejandro Abdelnur added a comment -

          In the case of the local run, the ReduceTask is hardcoded using the Merger, the ReduceInputMerger impl is not being used. Is that intentional?

          Show
          Alejandro Abdelnur added a comment - In the case of the local run, the ReduceTask is hardcoded using the Merger, the ReduceInputMerger impl is not being used. Is that intentional?
          Hide
          Mariappan Asokan added a comment -

          Hi Alejandro,
          You are right. That is intentional. I will do it as part of MAPREDUCE-4808.

          – Asokan

          Show
          Mariappan Asokan added a comment - Hi Alejandro, You are right. That is intentional. I will do it as part of MAPREDUCE-4808 . – Asokan
          Hide
          Mariappan Asokan added a comment -

          Created a combo patch of 4812 and 4809 and added a mock plugin test for ReduceInputMerger.

          – Asokan

          Show
          Mariappan Asokan added a comment - Created a combo patch of 4812 and 4809 and added a mock plugin test for ReduceInputMerger. – Asokan
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12554755/COMBO-mapreduce-4812-4809.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3065//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3065//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12554755/COMBO-mapreduce-4812-4809.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3065//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3065//console This message is automatically generated.
          Hide
          Alejandro Abdelnur added a comment -

          +1

          Show
          Alejandro Abdelnur added a comment - +1
          Hide
          Mariappan Asokan added a comment -

          Hi Alejandro,
          I fixed a minor problem in the incremental patch file mapreduce-4812.patch. I am also uploading the combo patch file. I renamed it to be consistent with the sequence of patches applied to it from left to right.

          Thanks.
          – Asokan

          Show
          Mariappan Asokan added a comment - Hi Alejandro, I fixed a minor problem in the incremental patch file mapreduce-4812.patch. I am also uploading the combo patch file. I renamed it to be consistent with the sequence of patches applied to it from left to right. Thanks. – Asokan
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12554779/COMBO-mapreduce-4809-4812.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3068//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3068//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12554779/COMBO-mapreduce-4809-4812.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3068//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3068//console This message is automatically generated.
          Hide
          Alejandro Abdelnur added a comment -

          +1

          Show
          Alejandro Abdelnur added a comment - +1
          Hide
          Mariappan Asokan added a comment -

          Hi Alejandro and Arun,
          I simplified the patch somewhat. Please take a look.

          Thanks,
          – Asokan

          Show
          Mariappan Asokan added a comment - Hi Alejandro and Arun, I simplified the patch somewhat. Please take a look. Thanks, – Asokan
          Hide
          Alejandro Abdelnur added a comment -

          +1 pending jenkins run with the combo in trunk.

          Show
          Alejandro Abdelnur added a comment - +1 pending jenkins run with the combo in trunk.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12555404/COMBO-mapreduce-4809-4812.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3084//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3084//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12555404/COMBO-mapreduce-4809-4812.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3084//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3084//console This message is automatically generated.
          Hide
          Alejandro Abdelnur added a comment -

          Arun C Murthy, patch looks good to me, I'll wait till tomorrow morning to commit in case you find something, or if you +1 it before then.

          Show
          Alejandro Abdelnur added a comment - Arun C Murthy , patch looks good to me, I'll wait till tomorrow morning to commit in case you find something, or if you +1 it before then.
          Hide
          Arun C Murthy added a comment -

          Alejandro Abdelnur I'm going to need more time to review this, thanks.

          Show
          Arun C Murthy added a comment - Alejandro Abdelnur I'm going to need more time to review this, thanks.
          Hide
          Alejandro Abdelnur added a comment -

          Arun, as I have not heard from you I'll be committing this patch later this morning.

          Show
          Alejandro Abdelnur added a comment - Arun, as I have not heard from you I'll be committing this patch later this morning.
          Hide
          Arun C Murthy added a comment -

          Is there a hurry? I'll need a couple more days, I have several comments already.

          Show
          Arun C Murthy added a comment - Is there a hurry? I'll need a couple more days, I have several comments already.
          Hide
          Arun C Murthy added a comment -

          Alejandro Abdelnur we are introducing new interfaces to an incredibly important piece of the code (see issues like MAPREDUCE-3721 & MAPREDUCE-4842). Further this is an area where you don't have a history. So, I'd appreciate if you could stop rushing me.

          I've already spent a HUGE amount of time on this reviewing, see MAPREDUCE-2454, MAPREDUCE-4807 & MAPREDUCE-4809. This is close, let's get this right.

          Thanks.

          Show
          Arun C Murthy added a comment - Alejandro Abdelnur we are introducing new interfaces to an incredibly important piece of the code (see issues like MAPREDUCE-3721 & MAPREDUCE-4842 ). Further this is an area where you don't have a history. So, I'd appreciate if you could stop rushing me. I've already spent a HUGE amount of time on this reviewing, see MAPREDUCE-2454 , MAPREDUCE-4807 & MAPREDUCE-4809 . This is close, let's get this right. Thanks.
          Hide
          Mariappan Asokan added a comment -

          Hi Arun,
          I have some ideas to fix the problem in MAPREDUCE-4842. I posted my comments there. Please take a look.

          Thanks.

          – Asokan

          Show
          Mariappan Asokan added a comment - Hi Arun, I have some ideas to fix the problem in MAPREDUCE-4842 . I posted my comments there. Please take a look. Thanks. – Asokan
          Hide
          Arun C Murthy added a comment -

          Asokan, I see the same interface in MAPREDUCE-4808 too which I have concerns about.

          Can you please help me understand what patch I should be reviewing?

          Show
          Arun C Murthy added a comment - Asokan, I see the same interface in MAPREDUCE-4808 too which I have concerns about. Can you please help me understand what patch I should be reviewing?
          Hide
          Arun C Murthy added a comment -

          Alejandro - since you +1ed this jira, can you pls explain the rationale for the apis I've asked in MAPREDUCE-4808?

          Maybe you and Asokan had an offline conversation?

          Show
          Arun C Murthy added a comment - Alejandro - since you +1ed this jira, can you pls explain the rationale for the apis I've asked in MAPREDUCE-4808 ? Maybe you and Asokan had an offline conversation?
          Hide
          Alejandro Abdelnur added a comment -

          MAPREDUCE-4812 (this JIRA) makes the MergerManager is made pluggable, nothing else.

          In the current code, in the local case where Shuffle is not involved, the MergerManager is not being used; instead the Merger class is used directly.

          With MAPREDUCE-4808, the MergerManager is augmented to also handle the local case by adding a new method and moving the Merger.merge() invocation to it.

          MAPREDUCE-4808 introduces the merge pluggability for the local case when shuffle is not in the picture.

          My understanding is that Asokan split the pluggability of MergeManager from augmenting its functionality to handle the local case to keep the changes focused.

          I'm OK with folding both in a single JIRA if you think it makes more sense..

          Show
          Alejandro Abdelnur added a comment - MAPREDUCE-4812 (this JIRA) makes the MergerManager is made pluggable, nothing else. In the current code, in the local case where Shuffle is not involved, the MergerManager is not being used; instead the Merger class is used directly. With MAPREDUCE-4808 , the MergerManager is augmented to also handle the local case by adding a new method and moving the Merger.merge() invocation to it. MAPREDUCE-4808 introduces the merge pluggability for the local case when shuffle is not in the picture. My understanding is that Asokan split the pluggability of MergeManager from augmenting its functionality to handle the local case to keep the changes focused. I'm OK with folding both in a single JIRA if you think it makes more sense..
          Hide
          Mariappan Asokan added a comment -

          Hi Arun,
          Sorry I did not get back sooner. The intention of ReduceInputMerger interface is to have a pluggable MergeManager implementation. For a non-local job, Shuffle and MergeManager interact and synchronize with each other using the three methods waitForInMemoryMerge(), reserve(), and close(). So in order to use the Shuffle these methods are captured in ReduceInputMerger interface. I renamed waitForInMemoryMerge() to a generic name waitForResource() since the plugin implementation may not have the concept of in-memory merge.
          Since the return value from reserve() is MapOutput, I did some refactoring of MapOutput so that plugin can return its own implementation of it. I kept the refactoring done on MapOutput in MAPREDUCE-4808. With just MAPREDUCE-4812, an external plugin is not possible, but it has the core part of the concepts so that it is easy to review just ReduceInputMerger design. Similarly, for a local job the input is coming from local files. I enhanced ReduceInputMerger with one more method for this. It is also kept in MAPREDUCE-4808.

          Hope I explained well. Please let me know if you have any more questions.

          Thanks.

          – Asokan

          Show
          Mariappan Asokan added a comment - Hi Arun, Sorry I did not get back sooner. The intention of ReduceInputMerger interface is to have a pluggable MergeManager implementation. For a non-local job, Shuffle and MergeManager interact and synchronize with each other using the three methods waitForInMemoryMerge(), reserve(), and close() . So in order to use the Shuffle these methods are captured in ReduceInputMerger interface. I renamed waitForInMemoryMerge() to a generic name waitForResource() since the plugin implementation may not have the concept of in-memory merge. Since the return value from reserve() is MapOutput , I did some refactoring of MapOutput so that plugin can return its own implementation of it. I kept the refactoring done on MapOutput in MAPREDUCE-4808 . With just MAPREDUCE-4812 , an external plugin is not possible, but it has the core part of the concepts so that it is easy to review just ReduceInputMerger design. Similarly, for a local job the input is coming from local files. I enhanced ReduceInputMerger with one more method for this. It is also kept in MAPREDUCE-4808 . Hope I explained well. Please let me know if you have any more questions. Thanks. – Asokan
          Hide
          Arun C Murthy added a comment -

          For now I'll close this as a dup of MAPREDUCE-4808, we are having same discussions in both places

          Show
          Arun C Murthy added a comment - For now I'll close this as a dup of MAPREDUCE-4808 , we are having same discussions in both places
          Hide
          Mariappan Asokan added a comment -

          Hi Arun,
          That is fine with me. I will post a patch for MAPREDUCE-4808.

          – Asokan

          Show
          Mariappan Asokan added a comment - Hi Arun, That is fine with me. I will post a patch for MAPREDUCE-4808 . – Asokan

            People

            • Assignee:
              Mariappan Asokan
              Reporter:
              Mariappan Asokan
            • Votes:
              1 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development