Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: 1.9
Fix Version/s: None
Description
CrawlDbFilter.java defines the configuration key for a boolean that controls whether the URL filters are applied:
public static final String URL_FILTERING = "crawldb.url.filters";
However, that property is not present in nutch-default.xml. Currently the only way to set this value is with the -filter parameter on the command line.
The same applies to:
public static final String URL_NORMALIZING = "crawldb.url.normalizers";
public static final String URL_NORMALIZING_SCOPE = "crawldb.url.normalizers.scope";
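
To illustrate why the missing entries matter, here is a minimal sketch of how a boolean key that is absent from the configuration falls back to a default. It uses java.util.Properties as a stand-in for the real configuration object. The class name CrawlDbPropertySketch, the helper getBoolean, and the default of false are assumptions made for illustration and are not taken from the Nutch source.

```java
import java.util.Properties;

// Hypothetical sketch: Properties stands in for the job configuration.
public class CrawlDbPropertySketch {
    // Keys quoted from CrawlDbFilter.java in the description above.
    public static final String URL_FILTERING = "crawldb.url.filters";
    public static final String URL_NORMALIZING = "crawldb.url.normalizers";

    // Illustrative helper: returns defaultValue when the key is not set.
    static boolean getBoolean(Properties conf, String key, boolean defaultValue) {
        String value = conf.getProperty(key);
        return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();

        // Key not defined anywhere: the assumed default (false) is used.
        System.out.println(getBoolean(conf, URL_FILTERING, false));

        // Key set explicitly, as a configuration file entry would do.
        conf.setProperty(URL_FILTERING, "true");
        System.out.println(getBoolean(conf, URL_FILTERING, false));
    }
}
```

Under these assumptions, listing the three keys in nutch-default.xml would document them and make their defaults visible, rather than leaving them discoverable only by reading the source.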