Indexers should be able to normalize URLs. This is useful when a new normalizer has been applied to the entire CrawlDB. Without it, some or all records in a segment cannot be indexed at all.
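To illustrate the failure mode, here is a minimal sketch, assuming the indexer only emits a segment record when its URL matches a CrawlDB key. It uses no Nutch classes: `IndexerNormalizeSketch`, `normalize` and `indexable` are hypothetical names, and the trailing-slash rule merely stands in for whatever normalizer was newly applied to the CrawlDB.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

public class IndexerNormalizeSketch {

    // Hypothetical stand-in for a newly added normalizer rule:
    // drop a single trailing slash. Not a Nutch API.
    public static String normalize(String url) {
        return url.endsWith("/") ? url.substring(0, url.length() - 1) : url;
    }

    // Assumed join: a segment record is indexable only if its key
    // (the URL after applying keyFn) exists in the CrawlDB.
    public static List<String> indexable(Set<String> crawlDbKeys,
                                         List<String> segmentUrls,
                                         UnaryOperator<String> keyFn) {
        List<String> out = new ArrayList<>();
        for (String url : segmentUrls) {
            String key = keyFn.apply(url);
            if (crawlDbKeys.contains(key)) {
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // CrawlDB keys were already rewritten by the new normalizer.
        Set<String> crawlDb = Set.of("http://example.org/a", "http://example.org/b");
        // The segment was fetched earlier and still holds the old form of one URL.
        List<String> segment = List.of("http://example.org/a/", "http://example.org/b");

        // Without normalization at index time, the first record finds no match.
        System.out.println(indexable(crawlDb, segment, UnaryOperator.identity()));
        // With normalization at index time, both records match.
        System.out.println(indexable(crawlDb, segment, IndexerNormalizeSketch::normalize));
    }
}
```

In this sketch the unnormalized run keeps only `http://example.org/b`, while the normalized run keeps both URLs, which is the behavior this issue asks the indexer to support.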
- is depended upon by
- is duplicated by
NUTCH-1614 Plugin to exclude URLs matching regex list from indexing - to enable crawl but do not index