Description
Indexers should be able to normalize URLs. This is useful when a new normalizer has been applied to the entire CrawlDB. Without it, some or all records in a segment cannot be indexed.
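A minimal sketch of why this matters, assuming the indexer matches segment records to CrawlDB entries by URL key. The class `IndexerNormalizeSketch`, the `normalize` rule (lower-case scheme and host, drop the fragment), and the `canIndex` helper are all hypothetical illustrations, not Nutch APIs; a real setup would use whatever normalizers are configured.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class IndexerNormalizeSketch {

    // Hypothetical normalizer (an assumption for illustration only):
    // lower-cases the scheme and host and drops the fragment.
    static String normalize(String url) {
        try {
            URI u = new URI(url);
            String scheme = u.getScheme() == null ? null : u.getScheme().toLowerCase(Locale.ROOT);
            String host = u.getHost() == null ? null : u.getHost().toLowerCase(Locale.ROOT);
            return new URI(scheme, u.getUserInfo(), host, u.getPort(),
                    u.getPath(), u.getQuery(), null).toString();
        } catch (URISyntaxException e) {
            return url; // leave unparsable URLs untouched
        }
    }

    // A segment record is treated as indexable only if its URL key
    // has a matching entry in the (already normalized) CrawlDB.
    static boolean canIndex(Map<String, String> crawlDb, String segmentUrl,
                            boolean normalizeAtIndexTime) {
        String key = normalizeAtIndexTime ? normalize(segmentUrl) : segmentUrl;
        return crawlDb.containsKey(key);
    }

    public static void main(String[] args) {
        // The CrawlDB was rewritten with the new normalizer.
        Map<String, String> crawlDb = new HashMap<>();
        crawlDb.put(normalize("http://Example.COM/page#top"), "db_fetched");

        // The segment still holds the URL as it was originally fetched.
        String segmentUrl = "http://Example.COM/page#top";

        System.out.println(canIndex(crawlDb, segmentUrl, false)); // keys differ
        System.out.println(canIndex(crawlDb, segmentUrl, true));  // keys match
    }
}
```

With normalization disabled at index time the lookup misses, because the segment key still carries the old form of the URL; with it enabled, the same record lines up with its CrawlDB entry.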
Attachments
Issue Links
- is depended upon by: NUTCH-1323 AjaxNormalizer (Closed)
- is duplicated by: NUTCH-1614 Plugin to exclude URLs matching regex list from indexing - to enable crawl but do not index (Open)