Hadoop Map/Reduce / MAPREDUCE-5108

Changes needed for Binary Compatibility for MR applications via YARN

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.0.3-alpha
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      As we get ready to ship out a beta/stable version of hadoop-2, it makes sense to spend time reviewing support for existing MR applications (hadoop-1) to migrate seamlessly.

      We've done various pieces of work over time; let's track progress and document things clearly. Zhijie Shen has done a bunch of testing, and the results look very promising so far.

      The aim is to support applications using org.apache.hadoop.mapred.* api in a binary compatible manner in hadoop-2 - thus, users can just take existing MR applications jars, point them at YARN clusters and things just work.

      Clearly, we might have some corner cases (we haven't seen many so far), including in semantics (not just APIs); however, the intent is to at least document them thoroughly, if not actually fix them where feasible.

      Also, it's clear that we will not be able to support org.apache.hadoop.mapreduce api in a binary compatible manner due to the interface changes we made in hadoop-0.21 (sigh), and hence, users using the new apis will have to re-compile (i.e. source compatible only).
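To illustrate the distinction between binary and source compatibility drawn above: if a type changes from a class to a Java interface between releases, bytecode compiled against the old shape uses a different call instruction and fails to link at run time, while the same source compiles cleanly against the new jars. The sketch below is a stdlib-only helper for checking which shape a type has on the current classpath. The class name `ApiKindCheck` and its method are illustrative, not part of Hadoop, and the idea that the hadoop-0.21 changes include class-to-interface conversions (org.apache.hadoop.mapreduce.JobContext is a commonly cited case) is an assumption, not something stated in this issue.

```java
// Stdlib-only sketch; the class and method names are illustrative, not part of Hadoop.
final class ApiKindCheck {

    /** Returns "interface", "class", or "missing" for a fully qualified type name. */
    static String kindOf(String typeName) {
        try {
            // initialize=false: load the type without running its static initializers.
            Class<?> type = Class.forName(typeName, false, ApiKindCheck.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            // The type (or something it depends on) is not on the classpath.
            return "missing";
        }
    }

    public static void main(String[] args) {
        // With no arguments, fall back to two JDK types so the sketch runs without Hadoop.
        String[] names = args.length > 0
                ? args
                : new String[] {"java.lang.Runnable", "java.lang.String"};
        for (String name : names) {
            System.out.println(name + " -> " + kindOf(name));
        }
    }
}
```

Run once with the hadoop-1 jars and once with the hadoop-2 jars on the classpath, passing a type name as the argument; a type that reports a different kind in each run would be a candidate for "re-compile required".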

      Net, given that the vast majority of users use the org.apache.hadoop.mapred API, it's a very reasonable way to ease migration to hadoop-2.

      1. mr1_mr2_api_diff.tar.gz
        3.03 MB
        Zhijie Shen
      2. MR_API_DIFF_v2.tar.gz
        3.05 MB
        Zhijie Shen
      3. Binary Backward Compatibility.pdf
        224 kB
        Zhijie Shen
      There are no Sub-Tasks for this issue.

        Activity

        Luke Lu added a comment -

        In theory we can also support the "old" mapreduce api in a binary compatible manner via MAPREDUCE-1700.

        Tom White added a comment -

        Zhijie, did you try using counters in your tests? The counters API was changed in an incompatible way in MAPREDUCE-901, so I'd be surprised if that works.

        The major incompatibilities that I am aware of are listed here: https://issues.apache.org/jira/browse/HADOOP-7738?focusedCommentId=13163731&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13163731

        Alejandro Abdelnur added a comment -

        Tom is correct, counters API does not work. Oozie uses exclusively the mapred API for job submission/monitoring and counters are an issue. Because of that we need to compile Oozie for Hadoop 1 or Hadoop 2 depending on the target cluster.

        Arun C Murthy added a comment -

        Tom White / Alejandro Abdelnur - I thought we fixed Counters with MAPREDUCE-3697. Can you please share details on what issues you are seeing with Oozie? Thanks.

        Alejandro Abdelnur added a comment -

        Arun, you are right, MAPREDUCE-3697 fixed the counters for Oozie. I've just verified that Oozie 3.3.2 compiled against Hadoop 1.1.1 works against Hadoop 2.0.4 with the right set of Hadoop JARs for a WF job doing MR actions. So apps using the mapred API and compiled with Hadoop 1 seem to be compatible with Hadoop 2 for job submission/monitoring/stats.

        Arun C Murthy added a comment -

        Thanks for verifying Alejandro Abdelnur!

        Zhijie Shen added a comment -

        I've uploaded a document that summarizes the investigation I've done so far. In brief, I ran the hadoop-1 examples.jar on hadoop-2 to find problems, used jdiff to create the API diff documents, and made some preliminary investigation.

        Zhijie Shen added a comment -

        I've attached the api diff document package as well.

        Zhijie Shen added a comment -

        Tom White and Alessandro Tommasi, I've tested RandomWriter and RandomTextWriter in examples.jar, which call Reporter#incrCounter. org.apache.hadoop.mapred.Reporter exists in both MR1 and MR2, and its functions have been modified. Therefore, the implementation details about Counter are isolated from these two jobs.
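
The isolation described above can be sketched as follows. This is a minimal illustration, not Hadoop code: `ReporterSketch` is a hypothetical stand-in that models only a string-keyed counter call in the style of Reporter#incrCounter, and `MapBackedReporter` and `WriterJobSketch` are invented names. The job-side code is compiled only against the reporter interface, so a change to the counter implementation behind it does not touch the job.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.apache.hadoop.mapred.Reporter; only the counter call is modeled.
interface ReporterSketch {
    void incrCounter(String group, String counter, long amount);
}

// One possible backing store for counters. The job code below never references this type.
final class MapBackedReporter implements ReporterSketch {
    private final Map<String, Long> totals = new HashMap<>();

    @Override
    public void incrCounter(String group, String counter, long amount) {
        // Accumulate per "group.counter" key.
        totals.merge(group + "." + counter, amount, Long::sum);
    }

    long get(String group, String counter) {
        return totals.getOrDefault(group + "." + counter, 0L);
    }
}

// Job-side code: depends only on the reporter interface, not on any counter class.
final class WriterJobSketch {
    static void writeRecords(int records, int bytesPerRecord, ReporterSketch reporter) {
        for (int i = 0; i < records; i++) {
            reporter.incrCounter("Sketch", "RECORDS_WRITTEN", 1);
            reporter.incrCounter("Sketch", "BYTES_WRITTEN", bytesPerRecord);
        }
    }
}
```

A job that instead constructs or inspects counter objects directly would exercise the counter classes themselves, which is the case these two example jobs do not cover.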

        Tom White added a comment -

        Ah I'd forgotten about MAPREDUCE-3697. Glad to hear it's working.

        Steve Loughran added a comment -

        I got a stack trace trying to submit a Pig job; the submission API has changed from the Pig 0.10 library.

        Alejandro Abdelnur added a comment -

        Steve, AFAIK Pig uses the mapreduce API; the work done for binary compatibility only covers the mapred API.

        Zhijie Shen added a comment -

        Updated jdiff docs, based on the 5/13/13 trunk and branch-1.

        Zhijie Shen added a comment -

        Closing the ticket, as all the subtasks are closed.


          People

          • Assignee:
            Zhijie Shen
            Reporter:
            Arun C Murthy
          • Votes:
            0
            Watchers:
            30