Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Activity

      Rohini Palaniswamy added a comment -

      I think we should create a separate Release section where the major features of each release are listed (with links to the relevant documentation sections) along with upgrade steps and backward-incompatible changes. There is no single place to check what is available in a release.

      Robert Kanter added a comment -

      Thanks Rohini, that's a great overview of the sharelib changes!

      I think it would be useful to add this somewhere in the docs to give users a clear overview of the sharelib changes.

      Rohini Palaniswamy added a comment -

      Currently, Oozie lets users share action jars (pig/hive/hbase/hcatalog/distcp) through the sharelib.

      Current limitations:

      • The sharelib directory is created under the workflow system lib path. During an Oozie upgrade or a pig upgrade, the sharelib directory is deleted and recreated. This makes all running jobs fail, because Hadoop fails any job whose jars shipped through the Distributed Cache change before the job completes. Having to rerun those jobs wastes valuable cluster resources and causes SLA misses.

      Features:

      Sharelib Creation:

      • The admin runs the sharelib create command as usual, but the command now copies the sharelib contents into a new timestamped version of the sharelib directory (lib_<timestamp>) under the system lib path directory. Previously the contents were copied directly under the system lib path directory.
      • On startup, Oozie picks up the latest sharelib directory. It purges sharelib directories older than 7 days (configurable), except the previous latest one.
      • The contents of the sharelib can be viewed by running the "oozie admin -shareliblist" command.
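
      The startup behaviour described above can be sketched roughly as follows. This is only an illustration, not Oozie's actual code: the function names are made up, directory names are passed in as plain strings instead of being listed from HDFS, and the timestamp layout (yyyyMMddHHmmss) is an assumption.

```python
from datetime import datetime, timedelta

PREFIX = "lib_"
STAMP_FORMAT = "%Y%m%d%H%M%S"  # assumed timestamp layout, not stated in the comment


def parse_stamp(dir_name):
    """Return the datetime encoded in a lib_<timestamp> name, or None."""
    if not dir_name.startswith(PREFIX):
        return None
    try:
        return datetime.strptime(dir_name[len(PREFIX):], STAMP_FORMAT)
    except ValueError:
        return None


def plan_sharelib(dir_names, now, retention_days=7):
    """Pick the latest sharelib directory and list the directories to purge."""
    # Keep only names that look like lib_<timestamp>, sorted oldest first.
    stamped = sorted(
        (parse_stamp(n), n) for n in dir_names if parse_stamp(n) is not None
    )
    if not stamped:
        return None, []
    latest = stamped[-1][1]
    # The latest and the previous latest are never purged.
    keep = {name for _, name in stamped[-2:]}
    cutoff = now - timedelta(days=retention_days)
    purge = [name for ts, name in stamped if name not in keep and ts < cutoff]
    return latest, purge
```

      Calling plan_sharelib with the names found under the system lib path and the current time returns the directory to use plus the ones that are safe to delete. Names that do not match lib_<timestamp> are ignored.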

      Sharelib Update:

      • While the Oozie server is up, it should be possible for Oozie to switch to newer versions of the pig or hive jars without a restart. To do this, the admin runs the sharelib create command again with the newer version of the sharelib, then issues the "oozie admin -sharelibupdate" command to make the server pick up the latest sharelib. In an HA environment the update is propagated to all servers. Because the previous sharelib is not deleted, jobs that started with those jars still run to completion.
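
      To make the sequence concrete, here is a toy in-memory model of a single server. It is a sketch under assumptions, not how Oozie is implemented: the ShareLibServer class and its methods are hypothetical, and a Python list stands in for the HDFS directory listing.

```python
class ShareLibServer:
    """Toy model of one Oozie server's view of the sharelib (hypothetical)."""

    def __init__(self, available_dirs):
        # Stand-in for listing lib_<timestamp> directories on HDFS.
        self.available_dirs = list(available_dirs)
        # Fixed-width timestamps sort lexically, so max() picks the newest.
        self.current = max(self.available_dirs)

    def create(self, new_dir):
        # Models "sharelib create": adds a directory, removes nothing.
        self.available_dirs.append(new_dir)

    def update(self):
        # Models "oozie admin -sharelibupdate": re-point at the newest directory.
        self.current = max(self.available_dirs)
        return self.current

    def launch_job(self):
        # A job uses whichever directory was current when it was launched.
        return self.current
```

      In this model a job launched before update() keeps referring to the old directory, which still exists, while jobs launched afterwards use the new one. In an HA setup the same update() step would be applied to every server.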

      Sharelib Meta File:

      • This is an alternative to the sharelib directory. If you already have the pig, hive, etc. jars installed in some HDFS locations, this property file can be used to specify those locations. Newer versions can be picked up by updating the file and invoking "oozie admin -sharelibupdate".
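
      As an illustration of what reading such a file could look like, here is a minimal sketch. The comment does not spell out the file format, so the key=value layout, the comma-separated paths, and the function name are all assumptions made for the example.

```python
def parse_sharelib_meta(text):
    """Map each key in a properties-style file to its list of HDFS paths.

    Assumes one key=value entry per line, '#' for comments, and
    comma-separated paths; the real file format may differ.
    """
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line, skip it
        paths = [p.strip() for p in value.split(",") if p.strip()]
        mapping[key.strip()] = paths
    return mapping


# Hypothetical file contents; the keys and paths are invented for illustration.
SAMPLE = """
# action name = HDFS locations of its jars
pig=hdfs:///apps/pig/lib
hive=hdfs:///apps/hive/lib,hdfs:///apps/hcatalog/lib
"""
```

      Here parse_sharelib_meta(SAMPLE) yields one entry per action. Picking up newer jars then amounts to editing the file and re-reading it, which is what the -sharelibupdate step triggers on the server.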

      Shipping Launcher jars:

      • By default, the sharelib for a particular action is expected to contain the Oozie launcher jar as well. For example, the pig action sharelib should have oozie-sharelib-pig-<oozie version>.jar. It is there automatically when you use the sharelib tar file created by Oozie. If you have your own setup with just the pig jars, you can set oozie.action.ship.launcher.jar=true. In that case, Oozie automatically ships the launcher jars (from the Tomcat webapp WEB-INF/lib directory) to HDFS on startup, into a launcher_<timestamp> directory under the system lib path directory. When launching jobs, it includes them in the distributed cache for the corresponding action. Older launcher_<timestamp> directories are purged during startup, similar to lib_<timestamp> directories.

        People

        • Assignee:
          Purshotam Shah
          Reporter:
          Purshotam Shah
        • Votes:
          0
          Watchers:
          4