Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Won't Fix
- Affects Version/s: 1.10
- Fix Version/s: None
- Component/s: None
- Labels: None
- Patch Info: Patch Available
Description
The filter allows all protocols for all whitelisted domains, hosts, or suffixes, but it usually makes little sense to index both the http and https URLs of the same domain. Restricting the protocol would be similar to the host URL filter, which prevents indexing of duplicate hosts, e.g. apache.org and www.apache.org.
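To illustrate the behaviour described above, here is a minimal sketch in Python of a filter that accepts only one protocol per whitelisted host. It is not Nutch code: the `ALLOWED_PROTOCOL` mapping, the `accept` function, and the example hosts are all hypothetical names chosen for illustration, and the rule format is an assumption rather than the syntax of any attached patch.

```python
from urllib.parse import urlsplit

# Hypothetical rules: whitelisted host -> the single protocol to index.
# The mapping format is an illustrative assumption, not Nutch configuration.
ALLOWED_PROTOCOL = {
    "apache.org": "https",
    "example.org": "http",
}


def accept(url: str) -> bool:
    """Return True if the host is whitelisted and the URL uses its one allowed protocol."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # urlsplit already lowercases the hostname
    allowed = ALLOWED_PROTOCOL.get(host)
    # Reject hosts without a rule, and reject the non-preferred protocol.
    return allowed is not None and parts.scheme == allowed
```

With these assumed rules, `accept("https://apache.org/index.html")` passes while the `http` variant of the same URL is rejected, so only one protocol per host would reach the index.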
Attachments
Issue Links
- contains: NUTCH-2189 Domain filter must deactivate if no rules are present (Closed)
- is superseded by: NUTCH-2190 Protocol normalizer (Closed)