Details
Description
We currently have `db.ignore.external.links`, which restricts the crawl to links on the same hostname. This adds a new parameter, `db.ignore.external.links.domain`, which applies the same restriction based on the domain instead, so links between hosts of one domain are still followed.
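To illustrate the difference between the two checks, here is a minimal sketch. It is not the Nutch implementation: the class `ExternalLinkCheck` and its methods are hypothetical, and `naiveDomain` assumes the domain is simply the last two labels of the hostname, which is wrong for suffixes such as `co.uk` (a real implementation would consult a public suffix list).

```java
import java.net.URI;

public class ExternalLinkCheck {

    // Hypothetical helper: treats the last two host labels as the "domain".
    // This is an assumption for illustration only; it mishandles multi-label
    // public suffixes such as "co.uk".
    static String naiveDomain(String host) {
        String lower = host.toLowerCase();
        String[] labels = lower.split("\\.");
        int n = labels.length;
        if (n <= 2) {
            return lower;
        }
        return labels[n - 2] + "." + labels[n - 1];
    }

    // Host-based check: the link is internal only if both hostnames match.
    static boolean sameHost(String fromUrl, String toUrl) {
        String from = URI.create(fromUrl).getHost();
        String to = URI.create(toUrl).getHost();
        return from != null && from.equalsIgnoreCase(to);
    }

    // Domain-based check: the link is internal if both hosts share a domain.
    static boolean sameDomain(String fromUrl, String toUrl) {
        String from = URI.create(fromUrl).getHost();
        String to = URI.create(toUrl).getHost();
        return from != null && to != null
                && naiveDomain(from).equals(naiveDomain(to));
    }

    public static void main(String[] args) {
        String from = "http://www.example.com/index.html";
        String to = "http://blog.example.com/post";
        System.out.println("same host:   " + sameHost(from, to));
        System.out.println("same domain: " + sameDomain(from, to));
    }
}
```

Under these assumptions, a link from `www.example.com` to `blog.example.com` is external by hostname but internal by domain, which is the case the new parameter is meant to cover.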
Attachments
Issue Links
- relates to: NUTCH-2365 HTTP Redirects to SubDomains don't get crawled if db.ignore.external.links.mode == byDomain (Closed)