Oozie / OOZIE-555

(H | M) Support multiple versions of jar



    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate


      Oozie currently supports three ways of including jar files in a workflow. In each case, Oozie reads the jars from an
      HDFS directory and adds them to the distributed cache.
      1. System-supported jars.
      2. User-specific common jars.
      3. Workflow-specific jars.

      Current Shortcomings:
      However, current Oozie cannot support multiple versions of the same product (such as Pig or Hive) at the system
      level. In addition, Oozie users often include very old or unsupported jars, mostly unintentionally, which creates
      many support issues. If Oozie could restrict, to some extent, which versions may be used, this recurring support
      overhead could be reduced.

      This JIRA is created for the following purposes:
      1. Support multiple versions of jars. In practice there are often several active versions of, for instance, the Pig
      jar: one might be 'stable' and another the immediate next version.

      2. Enforce the use of supported jars. For example, a user could configure a specific version of Pig. If the user
      provides no configuration, Oozie picks up the most stable jar (system configurable).

      Note that this proposed feature will not fully prevent the use of unsupported jar versions.

      Design Details:

      1. Every product could have a product-specific sub-directory in the system lib dir. That sub-directory could hold
      multiple versions of the jars. For example: <SYSTEM_LIB>/pig/0.7/lib, <SYSTEM_LIB>/pig/0.8/lib and

      2. The current way of including jars from SYSTEM_LIB needs to be modified.

      3. In addition to the SYSTEM_LIB jars, Oozie needs to include the jars of the user-selected (or default) version.
      For example, if a user configures Pig 0.8, Oozie should include the jars from <SYSTEM_LIB>/pig/0.8/lib/*.jar. If
      the user does not configure a specific Pig version, Oozie should include <SYSTEM_LIB>/pig/stable/lib/*.jar.

      4. If the user specifies an unsupported version, Oozie should throw an exception with an appropriate error message.

      5. Oozie should include a product's jars only when that product is requested. For example, Oozie should include the
      Pig jars through PigActionExecutor and the Hive jars through HiveActionExecutor. As a side effect, selectively
      including only the appropriate jars reduces the number of jars placed in the distributed cache.

      6. SE/OPS will be responsible for maintaining the supported version directories of each product.
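
      Design points 1, 3 and 4 can be sketched as a small lookup. This is only an illustration, not Oozie code: the class
      VersionedLibResolver, its constructor arguments and the explicit list of supported versions are hypothetical (a real
      implementation would presumably discover the version directories in HDFS), and "stable" is used as the default only
      because the text above uses it as its example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper: resolves the lib directory of one product version. */
final class VersionedLibResolver {
    /** Default version directory; "stable" follows the example in the text. */
    static final String DEFAULT_VERSION = "stable";

    private final String systemLibPath;
    private final String product;
    private final Set<String> supportedVersions;

    VersionedLibResolver(String systemLibPath, String product, String... supportedVersions) {
        // Strip trailing slashes so joined paths never contain "//".
        this.systemLibPath = systemLibPath.replaceAll("/+$", "");
        this.product = product;
        this.supportedVersions = new LinkedHashSet<>(Arrays.asList(supportedVersions));
    }

    /**
     * Returns <SYSTEM_LIB>/<product>/<version>/lib. Falls back to the default
     * version when none is requested and rejects unsupported versions.
     */
    String resolve(String requestedVersion) {
        String version = (requestedVersion == null || requestedVersion.trim().isEmpty())
                ? DEFAULT_VERSION
                : requestedVersion.trim();
        if (!supportedVersions.contains(version)) {
            throw new IllegalArgumentException("Unsupported " + product + " version '"
                    + version + "'; supported versions: " + supportedVersions);
        }
        return systemLibPath + "/" + product + "/" + version + "/lib";
    }
}
```

      With new VersionedLibResolver("/syslib", "pig", "stable", "0.7", "0.8"), resolve("0.8") returns
      /syslib/pig/0.8/lib, resolve(null) returns /syslib/pig/stable/lib, and resolve("0.5") throws an
      IllegalArgumentException. The /syslib path is a placeholder for <SYSTEM_LIB>.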

      Implementation Details:
      How to implement it in current Oozie?

      Once we agree on the concept, we could make the following changes in the code.

      1. Instead of putting every APP_LIB_PATH entry into WorkflowAppService.APP_LIB_PATH_LIST, we could create two lists:
      WorkflowAppService.SYSTEM_LIB_PATH_LIST, which will contain only <SYSTEM_LIB>/*.jar, and
      WorkflowAppService.APP_LIB_PATH_LIST, which will hold the rest (user-specific jars and wf/lib/*).

      2. Modify JavaActionExecutor.setLibFilesArchives():

      String[] paths = getLibPaths(); // new method
      if (paths != null) {
          for (String path : paths) {
              addToCache(conf, appPath, path, false);
          }
      }

      // New method: base implementation
      protected String[] getLibPaths(...) {
          String[] paths = proto.getStrings(WorkflowAppService.APP_LIB_PATH_LIST);
          return paths;
      }

      3. For example, PigActionExecutor could override getLibPaths as in the following pseudo-code:

      protected String[] getLibPaths(...) {
          String[] paths = super.getLibPaths();
          String path = services.getConf().get(SYSTEM_LIB_PATH, " ");
          if (path.trim().length() > 0) {
              systemLibPath = new Path(path.trim());
          }
          else {
              return ..
          }
          // Versioned product directory, e.g. <SYSTEM_LIB>/pig/0.8
          String pigHome = systemLibPath + "/pig/" + usedVersion;
          List<String> libPaths = getLibFiles(fs, new Path(pigHome));
          return paths + libPaths; // pseudo-code: concatenate both sets of paths
      }
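
      The override pattern in steps 2 and 3 can also be shown as compilable Java. This is a minimal sketch under stated
      assumptions rather than the actual Oozie classes: BaseExecutorSketch and PigExecutorSketch are hypothetical stand-ins
      for JavaActionExecutor and PigActionExecutor, and the versioned Pig jar paths are passed in directly instead of being
      listed from HDFS. It mainly shows one way to express the "paths + libPaths" step, since Java arrays cannot be
      concatenated with "+".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-in for JavaActionExecutor: supplies only the application lib paths. */
class BaseExecutorSketch {
    private final String[] appLibPaths;

    BaseExecutorSketch(String... appLibPaths) {
        this.appLibPaths = appLibPaths;
    }

    /** Base implementation: user-specific and workflow jars only. */
    protected String[] getLibPaths() {
        return appLibPaths.clone();
    }
}

/** Stand-in for PigActionExecutor: appends the jars of the selected Pig version. */
class PigExecutorSketch extends BaseExecutorSketch {
    private final List<String> versionedPigJars;

    PigExecutorSketch(List<String> versionedPigJars, String... appLibPaths) {
        super(appLibPaths);
        this.versionedPigJars = versionedPigJars;
    }

    @Override
    protected String[] getLibPaths() {
        // Copy the base paths into a list, then append the product-specific jars.
        List<String> all = new ArrayList<>(Arrays.asList(super.getLibPaths()));
        all.addAll(versionedPigJars);
        return all.toArray(new String[0]);
    }
}
```

      Because the caller only invokes getLibPaths(), the loop in setLibFilesArchives() stays unchanged: a Pig action
      receives the application jars followed by the versioned Pig jars, while other actions receive only the application
      jars.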


      Assignee: Unassigned
      Reporter: Virag Kothari