Hadoop Common / HADOOP-1622

Hadoop should provide a way to allow the user to specify jar file(s) the user job depends on

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.17.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      This patch adds new command line options for

      hadoop jar

      which are

      hadoop jar -files <comma separated list of files> -libjars <comma separated list of jars> -archives <comma separated list of archives>

      The -files option allows you to specify a comma separated list of paths that will be present in the current working directory of your tasks.
      The -libjars option allows you to add jars to the classpaths of the maps and reduces.
      The -archives option allows you to pass archives as arguments; they are unzipped/unjarred, and a link with the name of the jar/zip is created in the current working directory of the tasks.
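      For illustration, a hedged sketch of how these options might be combined on one command line; the jar names, file names, and input/output paths below are placeholders, not taken from this issue:

      ```shell
      # Hypothetical invocation (all names are placeholders):
      # - lookup.txt will appear in each task's working directory
      # - dep1.jar and dep2.jar are added to the map/reduce classpaths
      # - models.zip is unzipped, with a link named models.zip created
      #   in the task working directory
      hadoop jar myjob.jar \
        -files lookup.txt \
        -libjars dep1.jar,dep2.jar \
        -archives models.zip \
        input output
      ```

      Exact option placement relative to the main class depends on how the job's driver parses its arguments; the release note above only specifies the option names and their comma separated values.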

      Description

      More likely than not, a user's job may depend on multiple jars.
      Right now, when submitting a job through bin/hadoop, there is no way for the user to specify that.
      A workaround is to re-package all the dependent jars into a new jar or to put the dependent jar files in the lib dir of the new jar.
      This workaround causes unnecessary inconvenience to the user. Furthermore, if the user does not own the main function
      (as in the case where the user uses Aggregate, datajoin, or streaming), the user has to re-package those system jar files too.
      It is much desired that Hadoop provide a clean and simple way for the user to specify a list of dependent jar files at the time
      of job submission. Something like:

      bin/hadoop .... --depending_jars j1.jar:j2.jar

        Attachments

        1. HADOOP-1622_1.patch
          20 kB
          Mahadev konar
        2. HADOOP-1622_2.patch
          28 kB
          Mahadev konar
        3. HADOOP-1622_3.patch
          29 kB
          Mahadev konar
        4. HADOOP-1622_4.patch
          28 kB
          Mahadev konar
        5. HADOOP-1622_5.patch
          30 kB
          Mahadev konar
        6. HADOOP-1622_6.patch
          30 kB
          Mahadev konar
        7. hadoop-1622-4-20071008.patch
          48 kB
          Dennis Kubes
        8. HADOOP-1622-5.patch
          46 kB
          Doug Cutting
        9. HADOOP-1622-6.patch
          46 kB
          Doug Cutting
        10. HADOOP-1622-7.patch
          44 kB
          Doug Cutting
        11. HADOOP-1622-8.patch
          45 kB
          Dennis Kubes
        12. HADOOP-1622-9.patch
          46 kB
          Dennis Kubes
        13. multipleJobJars.patch
          8 kB
          Dennis Kubes
        14. multipleJobResources.patch
          43 kB
          Dennis Kubes
        15. multipleJobResources2.patch
          44 kB
          Dennis Kubes

          Issue Links

            Activity

              People

              • Assignee:
                mahadev Mahadev konar
                Reporter:
                runping Runping Qi
              • Votes:
                0
                Watchers:
                9

                Dates

                • Created:
                  Updated:
                  Resolved: