Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- 1.19
- None
- Patch Available
Description
The plugin urlfilter-validator is activated by default (in nutch-default.xml) but has two major issues which may confuse users of Nutch:
- single-part domain names (localhost, etc.) are not allowed (NUTCH-2973)
- IPv6 host names are rejected as invalid (NUTCH-2705)
What about disabling it by default to overcome these issues?
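Until the default changes, one way to try this locally is to override plugin.includes in conf/nutch-site.xml and drop validator from the urlfilter group. This is a minimal sketch. The plugin list shown is an assumption about the default value and differs between releases, so copy the value from your own nutch-default.xml and remove only urlfilter-validator:

```xml
<!-- conf/nutch-site.xml: properties here override nutch-default.xml -->
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- Assumed starting point: the default group urlfilter-(regex|validator)
         is reduced to urlfilter-regex. Keep the rest of the list exactly as
         it appears in your nutch-default.xml. -->
    <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  </property>
</configuration>
```

With urlfilter-validator removed from plugin.includes, hosts such as localhost or IPv6 literals are no longer rejected by that plugin. Any other active URL filters (for example the rules of urlfilter-regex) still apply.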
Issue Links
- supercedes NUTCH-2973 Single domain names (eg https://localnet) can't be crawled - filtering fails (Closed)