Hadoop Map/Reduce
MAPREDUCE-4549

Distributed cache conflicts breaks backwards compatability

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.23.3, 2.0.2-alpha
    • Fix Version/s: 0.23.5, 2.0.4-alpha
    • Component/s: mrv2
    • Labels:
      None

      Description

      I recently put in MAPREDUCE-4503, which went a bit too far and broke backwards compatibility with 1.0 for distributed cache entries. Instead of changing the behavior of the distributed cache to more closely match 1.0 behavior, I want to just change the exception to a warning message informing users that it will become an error in 2.0.

      1. MR-4549-branch-0.23.txt
        11 kB
        Robert Joseph Evans
      2. MAPREDUCE-4549-trunk.patch
        7 kB
        Sandy Ryza

        Activity

        Transition                     Time In Source Status   Execution Times   Last Executer         Last Execution Date
        Open -> Resolved               71d 23h 32m             1                 Robert Joseph Evans   24/Oct/12 15:01
        Resolved -> Reopened           47d 11h 46m             1                 Sandy Ryza            11/Dec/12 01:47
        Reopened -> Patch Available    17h 29m                 1                 Sandy Ryza            11/Dec/12 19:16
        Patch Available -> Resolved    107d 2h 18m             1                 Roman Shaposhnik      28/Mar/13 21:34
        Resolved -> Closed             28d 4h 39m              1                 Arun C Murthy         26/Apr/13 03:14
        Arun C Murthy made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Roman Shaposhnik made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Roman Shaposhnik added a comment -

        All the findings are attached over here: MAPREDUCE-4820

        At this point I'd agree with Arun and close this one. We can track what's left over at MAPREDUCE-4820

        Arun C Murthy added a comment -

        Ok, that is news to me. Can we please close this and open a new jira then? Tx.

        Alejandro Abdelnur added a comment -

        I think the problem is different. Roman, would you please post the findings you sent me?

        Thx

        Arun C Murthy added a comment -

        Also, verified that the same fix is present in branch-2.0.4-alpha.

        Arun C Murthy added a comment -

        So, not sure why Oozie is still having problems... Roman, can you please re-check?

        Arun C Murthy added a comment -

        Here is what I see on branch-2 from 'git log':

        commit 9aba3ebb2d455932981cc37fe8e3fa7a6ec4da82
        Author: Alejandro Abdelnur <tucu@apache.org>
        Date: Tue Dec 11 19:50:32 2012 +0000

        MAPREDUCE-4549. Distributed cache conflicts breaks backwards compatability. (Robert Evans via tucu)

        git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1420363 13f79535-47bb-0310-9956-ffa450edef68

        Roman Shaposhnik made changes -
        Fix Version/s 2.0.4-alpha [ 12324138 ]
        Fix Version/s 2.0.5-beta [ 12324032 ]
        Priority Critical [ 2 ] Blocker [ 1 ]
        Roman Shaposhnik added a comment -

        Making this a blocker for 2.0.4-alpha as agreed. I will keep attaching logs/debugging data from Oozie runs to MAPREDUCE-4820

        Arun C Murthy made changes -
        Fix Version/s 2.0.4-beta [ 12324032 ]
        Fix Version/s 2.0.3-alpha [ 12323275 ]
        Robert Joseph Evans added a comment -

        Do we want it on trunk?

        For example if I pass in -libjar a/foo.jar -libjar b/foo.jar in 1.0 everything will work and both foo.jar files will be on the classpath (I don't know why you would do this, but it is possible). Under YARN only one of them will even be shipped in the distributed cache, so there is no way at all for both of them to be on the classpath.

        With this patch it will output a warning message when a conflict occurs so people can try and fix the issue, but eventually we want to have the upper parts of the stack fix it, or we need to do what I was looking at doing initially and try to keep the colliding files separate by adding in support of sub directories in the distributed cache and using UUIDs.
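The conflict described in this comment can be illustrated with a minimal Java sketch. This is not the code from the attached patches or from MRApps.java; the class `CacheLinkConflicts` and its methods are hypothetical names, and the sketch assumes that a cache entry's link name is its URI fragment when present and the last path segment otherwise. It only reports collisions as warning strings and does not model which file ends up being localized.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not the actual MRApps code. */
final class CacheLinkConflicts {

    /**
     * Assumed naming rule: the URI fragment if present, else the last path segment.
     * Expects hierarchical URIs (those with a path component).
     */
    static String linkName(URI uri) {
        String fragment = uri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /**
     * Returns one warning message per entry whose link name was already requested
     * by an earlier entry. Conflicts are reported rather than rejected, mirroring
     * the "warning instead of exception" idea discussed in this issue.
     */
    static List<String> findConflicts(List<URI> entries) {
        Map<String, URI> seen = new LinkedHashMap<>();
        List<String> warnings = new ArrayList<>();
        for (URI uri : entries) {
            String name = linkName(uri);
            // putIfAbsent returns the earlier entry if the name is already taken.
            URI previous = seen.putIfAbsent(name, uri);
            if (previous != null) {
                warnings.add("cache entry " + uri + " conflicts with " + previous
                        + " on link name " + name);
            }
        }
        return warnings;
    }
}
```

Under these assumptions, passing a/foo.jar and b/foo.jar yields a single warning, because both entries resolve to the link name foo.jar.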

        Arun C Murthy added a comment -

        Has this been committed to trunk? Can we close this?

        Robert Joseph Evans added a comment -

        I am fine with this being in branch-2. +1

        Alejandro Abdelnur added a comment - - edited

        Committed to branch-2 now, I'll wait till FRI noon to see if there are objections for committing this to trunk as well.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12560334/MAPREDUCE-4549-trunk.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3118//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3118//console

        This message is automatically generated.

        Alejandro Abdelnur added a comment -

        +1 pending jenkins.

        Sandy Ryza added a comment -

        Reopened and uploaded a patch for trunk

        Sandy Ryza made changes -
        Status Reopened [ 4 ] Patch Available [ 10002 ]
        Sandy Ryza made changes -
        Attachment MAPREDUCE-4549-trunk.patch [ 12560334 ]
        Sandy Ryza made changes -
        Fix Version/s 2.0.3-alpha [ 12323275 ]
        Affects Version/s 2.0.2-alpha [ 12322471 ]
        Sandy Ryza made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Robert Joseph Evans made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Robert Joseph Evans made changes -
        Fix Version/s 0.23.5 [ 12323312 ]
        Fix Version/s 0.23.4 [ 12323264 ]
        Robert Joseph Evans made changes -
        Fix Version/s 0.23.4 [ 12323264 ]
        Fix Version/s 0.23.3 [ 12320060 ]
        Robert Joseph Evans added a comment -

        This JIRA has been lingering for a while. Alejandro, if you want this in branch-2 feel free to merge it in there and then resolve the JIRA, if not, I will just resolve the JIRA myself in a week or so.

        Hudson added a comment -

        Integrated in Hadoop-Hdfs-0.23-Build #347 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/347/)
        MAPREDUCE-4549. Distributed cache conflicts breaks backwards compatability (Robert Evans via tgraves) (Revision 1374407)

        Result = SUCCESS
        tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1374407
        Files :

        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java
        Thomas Graves made changes -
        Fix Version/s 0.23.3 [ 12320060 ]
        Thomas Graves added a comment -

        I committed this to 0.23.3, leaving jira open until we get agreement on branch-2

        Thomas Graves added a comment -

        +1 for 0.23 change. I'm going to commit this to 0.23.

        Arun, Alejandro - is there agreement on reverting MAPREDUCE-4503 from 2.0 and pulling this jira in there too?

        Robert Joseph Evans added a comment -

        That is fine with me. I realize that a lot of people are not going to use 0.23 as a bridge between 1.0 and 2.0 so if we want it to be deprecated in 2.0 too that is fine. If we revert MAPREDUCE-4503 this patch should apply cleanly to branch-2 also.

        Alejandro Abdelnur added a comment -

        No worries, no need to be sorry at all. Thanks for the details.

        IMO we should do the warning for Hadoop 2.x as well. Maybe in Hadoop 3.x it could be converted to an exception.

        Robert Joseph Evans added a comment -

        Alejandro,

        Sorry, you are right: I was way too vague about who I talked with. "Oozie people" does not really say anything about who they are or exactly what we talked about. Thank you for calling me out on this; it was my bad.

        I spoke with Mohammad Islam and Virag Kothari. The extent of the conversation was primarily me clarifying the differences between 1.0 behavior of the distributed cache, the behavior of the cache in 2.0/0.23 prior to MAPREDUCE-4503, and its behavior post MAPREDUCE-4503. After that we talked about my proposed fix here, of turning the exception into a warning. They both were OK with that but wanted to perhaps look into having the error checking in 2.0 also happen as new items were added through the distributed cache APIs so that they could react to the issues on a per entry basis, instead of having the job submission fail. I thought this was reasonable and told them to file a JIRA against MAPREDUCE for it.

        I am happy to hear any other feedback people may have on the issues. The call was mostly for clarification. I probably didn't even need to mention it here, but since I did, here is the full disclosure.

        Alejandro Abdelnur added a comment -

        Where was this discussed with the Oozie people? I'm not aware of this, I did not see any discussion in the oozie-dev@ alias.

        Would you please summarize the issue, how it affects Oozie and what is exactly the impact of leaving it incompatible in Hadoop 2?

        At first look I think we should keep backwards compatibility in Hadoop 2.

        Jason Lowe added a comment -

        +1 (non-binding), lgtm.

        Robert Joseph Evans made changes -
        Attachment MR-4549-branch-0.23.txt [ 12541228 ]
        Robert Joseph Evans added a comment -

        This patch will only apply to branch-0.23

        Robert Joseph Evans made changes -
        Field Original Value New Value
        Affects Version/s 3.0.0 [ 12320355 ]
        Affects Version/s 2.1.0-alpha [ 12321442 ]
        Affects Version/s 2.2.0-alpha [ 12322471 ]
        Target Version/s 0.23.3 [ 12320060 ]
        Description (original value):
        I recently put in MAPREDUCE-4503, which went a bit too far and broke backwards compatibility with 1.0 in distributed cache entries. This is to change the behavior of the distributed cache to more closely match that of 1.0.

        In 1.0, when adding a cache archive link, the first link would win (be the one that was created), not the last one as is the current behavior. When there were conflicts, all of the others were ignored and just did not get a symlink created. Finally, no symlink was created for archives that did not have a fragment in the URL.

        To simulate this behavior, after we parse the cache files and cache archives configuration we should walk through all conflicting links and pick the first link that has a fragment to win. If no link has a fragment, then the first link wins. All other conflicting links will get a warning, and the name of the link will be changed to include a UUID. If the same file is in the distributed cache as both a cache file and a cache archive, we will throw an exception, for backwards compatibility.

        Description (new value):
        I recently put in MAPREDUCE-4503, which went a bit too far and broke backwards compatibility with 1.0 for distributed cache entries. Instead of changing the behavior of the distributed cache to more closely match 1.0 behavior, I want to just change the exception to a warning message informing users that it will become an error in 2.0.
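The resolution rule from the original description (the first link with a fragment wins, otherwise the first link wins, and the losing links are renamed with a UUID) was dropped in favor of the plain warning, but it can be sketched roughly as follows. This is an illustration under stated assumptions, not code from any patch on this issue: `FirstFragmentWins`, `resolve`, and the `name-UUID` suffix format are hypothetical, the link-name rule (fragment, else last path segment) is assumed, and the cache-file versus cache-archive exception case is not modeled.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch of the originally proposed rule; not actual Hadoop code. */
final class FirstFragmentWins {

    static boolean hasFragment(URI uri) {
        String fragment = uri.getFragment();
        return fragment != null && !fragment.isEmpty();
    }

    /** Assumed naming rule: fragment if present, else the last path segment. */
    static String linkName(URI uri) {
        if (hasFragment(uri)) {
            return uri.getFragment();
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /**
     * Maps each entry to the link name it would receive.
     * Exact duplicate URIs collapse into a single map key in this sketch.
     */
    static Map<URI, String> resolve(List<URI> entries) {
        // Group entries by requested link name, preserving input order.
        Map<String, List<URI>> byName = new LinkedHashMap<>();
        for (URI uri : entries) {
            byName.computeIfAbsent(linkName(uri), k -> new ArrayList<>()).add(uri);
        }
        Map<URI, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<URI>> group : byName.entrySet()) {
            List<URI> candidates = group.getValue();
            // Winner: the first entry that has a fragment, else simply the first entry.
            URI winner = candidates.get(0);
            for (URI candidate : candidates) {
                if (hasFragment(candidate)) {
                    winner = candidate;
                    break;
                }
            }
            for (URI candidate : candidates) {
                if (candidate.equals(winner)) {
                    result.put(candidate, group.getKey());
                } else {
                    // Losers keep a link, but under a name made unique with a UUID.
                    System.err.println("WARN: link name " + group.getKey()
                            + " already taken; renaming link for " + candidate);
                    result.put(candidate, group.getKey() + "-" + UUID.randomUUID());
                }
            }
        }
        return result;
    }
}
```

Note the difference from the 1.0 behavior described above: in 1.0 the losing entries simply got no symlink, whereas this proposal would have kept them under a renamed link.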
        Robert Joseph Evans added a comment -

        After talking to Arun and several Oozie people about this, I have decided that for branch-2 and trunk we will keep the same behavior as now, and for 0.23.3 we will change the exception into a warning. This will give Oozie and others time to deal with the incompatibility and will not block 0.23.3 from being released.

        Hudson added a comment -

        Integrated in Hadoop-Hdfs-0.23-Build #344 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/344/)
        svn merge -c -1369197 Reverting: MAPREDUCE-4503 in branch-0.23 until MAPREDUCE-4549 can be addressed. (Revision 1372573)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372573
        Files :

        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java
        Robert Joseph Evans created issue -

          People

          • Assignee:
            Robert Joseph Evans
            Reporter:
            Robert Joseph Evans
          • Votes:
            0
            Watchers:
            17
