Description
This is a very minor issue in Subcollection.java. It throws a NullPointerException if the (empty) blacklist element is omitted. I think it should either handle an omitted blacklist element silently or throw a decent error message stating that the blacklist element is required. The following exception is thrown if the blacklist element is omitted from a subcollection block:
2010-09-06 13:32:30,438 INFO collection.CollectionManager - Instantiating CollectionManager
2010-09-06 13:32:30,438 INFO collection.CollectionManager - initializing CollectionManager
2010-09-06 13:32:30,451 INFO collection.CollectionManager - file has1 elements
2010-09-06 13:32:30,456 WARN collection.CollectionManager - Error occured:java.lang.NullPointerException
2010-09-06 13:32:30,469 WARN collection.CollectionManager - java.lang.NullPointerException
2010-09-06 13:32:30,470 WARN collection.CollectionManager - at org.apache.nutch.collection.Subcollection.initialize(Subcollection.java:173)
2010-09-06 13:32:30,470 WARN collection.CollectionManager - at org.apache.nutch.collection.CollectionManager.parse(CollectionManager.java:98)
2010-09-06 13:32:30,470 WARN collection.CollectionManager - at org.apache.nutch.collection.CollectionManager.init(CollectionManager.java:75)
2010-09-06 13:32:30,470 WARN collection.CollectionManager - at org.apache.nutch.collection.CollectionManager.<init>(CollectionManager.java:56)
2010-09-06 13:32:30,471 WARN collection.CollectionManager - at org.apache.nutch.collection.CollectionManager.getCollectionManager(CollectionManager.java:115)
2010-09-06 13:32:30,471 WARN collection.CollectionManager - at org.apache.nutch.indexer.subcollection.SubcollectionIndexingFilter.addSubCollectionField(SubcollectionIndexingFilter.java:65)
2010-09-06 13:32:30,471 WARN collection.CollectionManager - at org.apache.nutch.indexer.subcollection.SubcollectionIndexingFilter.filter(SubcollectionIndexingFilter.java:71)
2010-09-06 13:32:30,471 WARN collection.CollectionManager - at org.apache.nutch.indexer.IndexingFilters.filter(IndexingFilters.java:109)
2010-09-06 13:32:30,471 WARN collection.CollectionManager - at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:134)
2010-09-06 13:32:30,472 WARN collection.CollectionManager - at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
2010-09-06 13:32:30,472 WARN collection.CollectionManager - at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
2010-09-06 13:32:30,472 WARN collection.CollectionManager - at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
2010-09-06 13:32:30,472 WARN collection.CollectionManager - at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
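For illustration, the two behaviours suggested above could look roughly like the sketch below. This is not the actual Subcollection.java code. The class name `OptionalBlacklistSketch`, the helper names `textOrEmpty`, `requiredText` and `parse`, and the sample `name` and `whitelist` elements are all assumptions made for the example. Only the standard JDK DOM API is used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch, not the Nutch implementation.
public class OptionalBlacklistSketch {

    // Option 1: treat a missing element as empty instead of dereferencing null.
    public static String textOrEmpty(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            return "";
        }
        return nodes.item(0).getTextContent().trim();
    }

    // Option 2: fail early with a message that names the missing element.
    public static String requiredText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            throw new IllegalArgumentException(
                "Missing required <" + tag + "> element in <"
                    + parent.getTagName() + ">");
        }
        return nodes.item(0).getTextContent().trim();
    }

    // Parses an XML string and returns its root element.
    public static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new RuntimeException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // A subcollection block with the blacklist element omitted (sample data).
        Element sub = parse(
            "<subcollection><name>nutch</name>"
                + "<whitelist>http://lucene.apache.org/nutch/</whitelist>"
                + "</subcollection>");

        // Option 1 yields an empty blacklist rather than a NullPointerException.
        System.out.println("blacklist=[" + textOrEmpty(sub, "blacklist") + "]");

        // Option 2 reports which element is missing.
        try {
            requiredText(sub, "blacklist");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Either option replaces the bare NullPointerException; which one fits better depends on whether the blacklist element is meant to be optional.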