Details
Type: Improvement
Status: Closed
Priority: Minor
Resolution: Resolved
Description
The Hadoop libs are not included in the job file, because a Hadoop cluster must already be available in order to use it. However, some of Hadoop's transitive dependencies still make it into the job file. We already exclude some of them, but could extend the exclusions to:
<exclude org="org.mortbay.jetty"/>
<exclude org="com.sun.jersey"/>
<exclude org="tomcat"/>
Note that we need some of the Hadoop classes and dependencies in order to run Nutch in local mode.
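For context, here is a minimal sketch of where these exclusions could sit in ivy.xml, assuming they are declared as module-wide excludes at the end of the dependencies section. The Hadoop dependency line is hypothetical: the module name and the hadoop.version property are placeholders, not taken from the actual build.

```xml
<dependencies>
  <!-- Hypothetical Hadoop dependency; name and revision are placeholders -->
  <dependency org="org.apache.hadoop" name="hadoop-core"
              rev="${hadoop.version}" conf="*->default"/>

  <!-- Module-wide exclusions: artifacts from these organisations are
       dropped wherever they appear in the transitive dependency graph -->
  <exclude org="org.mortbay.jetty"/>
  <exclude org="com.sun.jersey"/>
  <exclude org="tomcat"/>
</dependencies>
```

Excludes declared this way apply to the whole module, so they would also remove these artifacts from the classpath used in local mode. If that turned out to be a problem, the conf attribute of exclude could probably be used to limit an exclusion to specific configurations.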
Alternatively, we could have a separate Ivy profile for Hadoop only and store its dependencies in a separate location so that they do not get copied to the job jar. However, this is probably overkill if the dependencies above are not needed when running in local mode.
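The alternative could look roughly like the sketch below, assuming "profile" means an Ivy configuration. The configuration name hadoop, the module name, and the hadoop.version property are all hypothetical and only illustrate the idea.

```xml
<ivy-module version="2.0">
  <info organisation="org.apache.nutch" module="nutch"/>

  <configurations>
    <conf name="default"/>
    <!-- Hypothetical configuration holding only Hadoop and its transitive deps -->
    <conf name="hadoop" description="Hadoop libs, kept out of the job jar"/>
  </configurations>

  <dependencies>
    <!-- Mapped to the hadoop configuration only, so resolving "default"
         does not pull in Hadoop or anything it depends on -->
    <dependency org="org.apache.hadoop" name="hadoop-core"
                rev="${hadoop.version}" conf="hadoop->default"/>
  </dependencies>
</ivy-module>
```

With this layout, the build would need a separate ivy:retrieve call for conf="hadoop" that uses its own pattern, so that those jars land in a directory the job target does not package, while local mode still adds that directory to its classpath.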