Nutch / NUTCH-1805

Remove unnecessary transitive dependencies from Hadoop core

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Resolved
    • None
    • None
    • Component/s: build
    • None

    Description

      The Hadoop libs are not included in the job file, because a Hadoop cluster must already be available in order to use it. However, some of Hadoop's transitive dependencies still make it into the job file. We already exclude some of them but could extend the exclusions to:
      <exclude org="org.mortbay.jetty"/>
      <exclude org="com.sun.jersey"/>
      <exclude org="tomcat"/>
      Note that we need some of the Hadoop classes and dependencies in order to run Nutch in local mode.
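
      As a rough sketch of where such excludes would sit, assuming the Hadoop dependency is declared as hadoop-core in the project's ivy.xml (the rev value and the conf mapping below are placeholders, not taken from this issue), the three proposed lines would be nested inside that dependency element:

      ```xml
      <!-- Hypothetical ivy.xml fragment: rev and conf are illustrative only -->
      <dependency org="org.apache.hadoop" name="hadoop-core" rev="x.y.z" conf="*->default">
        <!-- Excludes proposed in this issue: drop every module published
             under these organisations from Hadoop's transitive graph -->
        <exclude org="org.mortbay.jetty"/>
        <exclude org="com.sun.jersey"/>
        <exclude org="tomcat"/>
      </dependency>
      ```

      An exclude with only an org attribute removes all modules from that organisation, so local mode should be re-tested after such a change to confirm that none of them is needed at runtime.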

      Alternatively, we could have a separate Ivy profile only for Hadoop and store its dependencies in a separate location so that they do not get copied to the job jar. However, this is probably overkill if the dependencies above are not needed when running in local mode.
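
      For comparison, the separate-profile alternative could look roughly like the following. This is only a sketch: the configuration name "hadoop", the rev value, and the target directory are hypothetical, and the Ant fragment assumes the Ivy tasks are already loaded under the usual ivy: namespace.

      ```xml
      <!-- ivy.xml (sketch): a dedicated configuration for Hadoop -->
      <configurations>
        <conf name="default"/>
        <conf name="hadoop" description="Hadoop libs, kept out of the job jar"/>
      </configurations>
      <dependencies>
        <dependency org="org.apache.hadoop" name="hadoop-core" rev="x.y.z" conf="hadoop->default"/>
      </dependencies>
      ```

      ```xml
      <!-- build.xml (sketch): retrieve that configuration into its own directory -->
      <ivy:retrieve conf="hadoop"
                    pattern="${build.dir}/hadoop-lib/[artifact]-[revision].[ext]"/>
      ```

      With this layout the job target would package only the directory holding the default configuration, while the local-mode classpath would include both directories.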

          People

            Assignee: Unassigned
            Reporter: Julien Nioche (jnioche)