Description
NUTCH-2337 introduced a potential (though rare) NullPointerException, thrown when normalizing an ill-formed URL that consists of just the protocol followed by ":", ":/", ":////" or even more slashes:
% echo "http://///" \
    | runtime/local/bin/nutch org.apache.nutch.net.URLNormalizerChecker \
        -normalizer org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer
Checking URLNormalizer org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer
Exception in thread "main" java.lang.NullPointerException
    at org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer.normalize(BasicURLNormalizer.java:120)
    at org.apache.nutch.net.URLNormalizerChecker.checkOne(URLNormalizerChecker.java:72)
    at org.apache.nutch.net.URLNormalizerChecker.main(URLNormalizerChecker.java:110)
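For illustration only, the class of input described above (a protocol followed by nothing but a colon and slashes) can be recognized with a plain string check before normalization. The sketch below is an assumption, not the fix applied in Nutch: the class name IllFormedUrlGuard and the method isSchemeOnly are hypothetical, and the check does not touch BasicURLNormalizer at all.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: detects URLs that consist of
// only a scheme, a colon, and zero or more slashes (e.g. "http://///").
class IllFormedUrlGuard {

    // Scheme syntax follows RFC 3986: a letter, then letters, digits, "+", "." or "-".
    private static final Pattern SCHEME_ONLY =
        Pattern.compile("[A-Za-z][A-Za-z0-9+.-]*:/*");

    // Returns true if the trimmed input has no host, path, or anything else
    // after the scheme except slashes. Null input yields false.
    static boolean isSchemeOnly(String url) {
        if (url == null) {
            return false;
        }
        return SCHEME_ONLY.matcher(url.trim()).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSchemeOnly("http://///"));          // true
        System.out.println(isSchemeOnly("http:"));               // true
        System.out.println(isSchemeOnly("http://example.com/")); // false
    }
}
```

A caller could skip or reject such URLs before handing them to a normalizer. Whether rejecting them or guarding against the null value inside the normalizer is the better remedy is left open here.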