Oozie / OOZIE-1954

Add a way for the MapReduce action to be configured by Java code

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: trunk
    • Fix Version/s: 4.2.0
    • Component/s: None
    • Labels: None

    Description

      With certain other components (e.g. Avro, HBase's HFileOutputFormat), it becomes impractical to use the MapReduce action, and users must instead use the Java action. The problem is that these components require a lot of extra configuration that is often hidden from the user in Java code (e.g. HFileOutputFormat.configureIncrementalLoad(job, table);), which can also include decision logic, serialization, and other things that we can't do directly in an XML file.

      One way to solve this problem is to allow the user to give the MR action some Java code that would do this configuration, similar to how we allow the <job-xml> field to specify an external XML file of configuration properties.
      In more detail, we could have an interface; something like this:

      public interface OozieActionConfigurator {
          public void updateOozieActionConfiguration(Configuration conf);
      }
      

      that the user can implement, package in a jar, and include with their MR action (i.e. we would add a "<config-class>" field that lets them specify the class name). To protect the Oozie server from running user code (which could do anything it wants), the code would have to run in the Launcher Job. The Launcher Job could call this method after it loads the configuration prepared by the Oozie server.
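
      As a rough illustration of the flow described above, here is a minimal sketch of a user-supplied configurator and a launcher-side helper that loads it by class name. It is not the actual patch. The Configuration class below is a tiny stand-in for org.apache.hadoop.conf.Configuration so the example runs without Hadoop, and MyConfigurator, LauncherSketch, applyConfigurator, and the "example.*" property names are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// ASSUMPTION: stand-in for org.apache.hadoop.conf.Configuration, reduced to
// get/set so the sketch is self-contained.
class Configuration {
    private final Map<String, String> props = new HashMap<>();
    public void set(String name, String value) { props.put(name, value); }
    public String get(String name) { return props.get(name); }
}

// The interface as proposed in the description.
interface OozieActionConfigurator {
    void updateOozieActionConfiguration(Configuration conf);
}

// Hypothetical user class: shipped in the user's jar and named in <config-class>.
class MyConfigurator implements OozieActionConfigurator {
    @Override
    public void updateOozieActionConfiguration(Configuration conf) {
        // Decision logic that a static XML file of properties cannot express.
        String mode = conf.get("example.mode");
        conf.set("example.output.format", "bulk".equals(mode) ? "hfile" : "text");
    }
}

// Hypothetical launcher-side helper: runs in the Launcher Job, never on the server.
class LauncherSketch {
    static Configuration applyConfigurator(Configuration conf, String className) {
        if (className == null || className.isEmpty()) {
            return conf; // no <config-class> given, nothing to do
        }
        try {
            // Load the user's class by name and check that it implements the interface.
            Class<? extends OozieActionConfigurator> klass =
                    Class.forName(className).asSubclass(OozieActionConfigurator.class);
            OozieActionConfigurator configurator =
                    klass.getDeclaredConstructor().newInstance();
            // Called after the server-prepared configuration has been loaded.
            configurator.updateOozieActionConfiguration(conf);
            return conf;
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Could not run configurator " + className, e);
        }
    }

    public static void main(String[] args) {
        Configuration conf = new Configuration(); // pretend the server prepared this
        conf.set("example.mode", "bulk");
        applyConfigurator(conf, MyConfigurator.class.getName());
        System.out.println(conf.get("example.output.format"));
    }
}
```

      Loading the class reflectively keeps the Oozie server free of user code: the server only passes along a class name, and the class is instantiated inside the Launcher Job.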

      This would also help users who use the Java action to launch MR jobs and expect a number of things to be done for them that are not (e.g. delegation token propagation, config loading, returning the Hadoop job to Oozie). The MR action does all of these, so the more users we can move from the Java action to the MR action, the less often they'll run into these difficulties.

      Some of this may change slightly as I implement it (e.g. I'll have to handle thrown exceptions). I may also keep this general enough to be compatible with all action types, in case we want to add it to any of them in the future; for now, though, the schema would only accept it for the MapReduce action.

      Attachments

        1. OOZIE-1954.patch
          39 kB
          Robert Kanter
        2. OOZIE-1954.patch
          38 kB
          Robert Kanter
        3. OOZIE-1954.patch
          39 kB
          Robert Kanter

            People

              Assignee: Robert Kanter
              Reporter: Robert Kanter
              Votes: 0
              Watchers: 4
