Description
Each executed job results in a number of occurrences of the exception below:
2011-01-27 13:40:34,457 ERROR conf.Configuration - Failed to set setXIncludeAware(true) for parser org.apache.xerces.jaxp.DocumentBuilderFactoryImpl@3801318b:java.lang.UnsupportedOperationException: This parser does not support specification "null" version "null"
java.lang.UnsupportedOperationException: This parser does not support specification "null" version "null"
at javax.xml.parsers.DocumentBuilderFactory.setXIncludeAware(DocumentBuilderFactory.java:590)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1054)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1040)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:980)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:436)
at org.apache.hadoop.fs.FileSystem.getDefaultUri(FileSystem.java:103)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
at org.apache.nutch.crawl.Injector.inject(Injector.java:230)
at org.apache.nutch.crawl.Injector.run(Injector.java:248)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.Injector.main(Injector.java:238)
This can be fixed by upgrading xercesImpl from 2.6.2 to 2.9.1. I've modified ivy and lib-xml's ivy configuration and can commit it. The question is: is upgrading the correct approach? I've tested Nutch with 2.9.1 and, apart from the annoying exception no longer appearing, everything works as expected.
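
For context, the stack trace suggests the exception comes from the default setXIncludeAware in javax.xml.parsers.DocumentBuilderFactory, which throws UnsupportedOperationException when the concrete factory does not override it. My understanding is that xercesImpl 2.6.2 predates that JAXP method, which would explain why the upgrade helps. Below is a minimal sketch for checking whichever parser is on the classpath. The class XIncludeProbe and the stub LegacyFactory are hypothetical names made up for illustration; the stub only stands in for an older factory and is not the real Xerces class.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

public class XIncludeProbe {

    /**
     * Hypothetical stand-in for a pre-JAXP-1.3 factory: it implements only the
     * abstract methods and does NOT override setXIncludeAware, so the base
     * class default (which throws) is used.
     */
    public static class LegacyFactory extends DocumentBuilderFactory {
        @Override public DocumentBuilder newDocumentBuilder() { return null; }
        @Override public void setAttribute(String name, Object value) { }
        @Override public Object getAttribute(String name) { return null; }
        @Override public void setFeature(String name, boolean value) { }
        @Override public boolean getFeature(String name) { return false; }
    }

    /** Returns true if the factory accepts setXIncludeAware(true) without throwing. */
    public static boolean supportsXInclude(DocumentBuilderFactory factory) {
        try {
            factory.setXIncludeAware(true);
            return true;
        } catch (UnsupportedOperationException e) {
            // Same failure mode as the one logged by conf.Configuration above.
            return false;
        }
    }

    public static void main(String[] args) {
        // The factory that JAXP resolves from the current classpath.
        DocumentBuilderFactory current = DocumentBuilderFactory.newInstance();
        System.out.println(current.getClass().getName()
                + " supports XInclude: " + supportsXInclude(current));

        // The stub that mimics an older parser implementation.
        System.out.println("LegacyFactory supports XInclude: "
                + supportsXInclude(new LegacyFactory()));
    }
}
```

Running this with the Nutch classpath before and after the xercesImpl upgrade should show which factory class is picked up and whether it accepts the call.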