Details
Description
New indexing framework for Nutch that provides a more generic field abstraction consistent with Lucene index semantics. It allows separate MapReduce jobs to be created for different fields, with those fields aggregated and indexed at the end. This overcomes a limitation of the current indexer, which restricts which databases can be passed into it. It also creates a new extension point, field-filters, for manipulating fields during the indexing process.
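To make the idea concrete, here is a minimal, dependency-free sketch of how per-field job outputs might be merged by document key and then passed through field filters before indexing. Every name in it (`FieldIndexingSketch`, `Field`, `FieldFilter`, `aggregate`) is hypothetical and chosen for illustration only; it does not reflect the classes in the attached patch, and plain in-memory maps stand in for the MapReduce job outputs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: these names are assumptions, not the Nutch API.
public class FieldIndexingSketch {

    /** A generic field carrying Lucene-style flags (indexed, stored, tokenized). */
    public static final class Field {
        public final String name;
        public final String value;
        public final boolean indexed;
        public final boolean stored;
        public final boolean tokenized;

        public Field(String name, String value,
                     boolean indexed, boolean stored, boolean tokenized) {
            this.name = name;
            this.value = value;
            this.indexed = indexed;
            this.stored = stored;
            this.tokenized = tokenized;
        }

        @Override
        public String toString() {
            return name + "=" + value;
        }
    }

    /** Hypothetical extension point: a filter may add, change, or drop fields. */
    public interface FieldFilter {
        List<Field> filter(String docKey, List<Field> fields);
    }

    /**
     * Merges the outputs of several field-producing jobs by document key
     * (for example a URL), then runs each filter over the merged fields.
     */
    public static Map<String, List<Field>> aggregate(
            List<Map<String, List<Field>>> jobOutputs,
            List<FieldFilter> filters) {
        Map<String, List<Field>> merged = new LinkedHashMap<>();
        for (Map<String, List<Field>> output : jobOutputs) {
            for (Map.Entry<String, List<Field>> e : output.entrySet()) {
                merged.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                      .addAll(e.getValue());
            }
        }
        for (Map.Entry<String, List<Field>> e : merged.entrySet()) {
            List<Field> fields = e.getValue();
            for (FieldFilter f : filters) {
                fields = f.filter(e.getKey(), fields);
            }
            e.setValue(fields);
        }
        return merged;
    }

    public static void main(String[] args) {
        String url = "http://example.com/";

        // Output of a hypothetical "basic fields" job.
        Map<String, List<Field>> basic = new LinkedHashMap<>();
        basic.put(url, Arrays.asList(
                new Field("title", "Example", true, true, true),
                new Field("description", "", true, true, true)));

        // Output of a hypothetical "link analysis" job.
        Map<String, List<Field>> links = new LinkedHashMap<>();
        links.put(url, Arrays.asList(
                new Field("score", "0.75", false, true, false)));

        // Example filter: drop fields whose value is empty.
        FieldFilter dropEmpty = (key, fields) -> {
            List<Field> kept = new ArrayList<>();
            for (Field f : fields) {
                if (!f.value.isEmpty()) {
                    kept.add(f);
                }
            }
            return kept;
        };

        System.out.println(
                aggregate(Arrays.asList(basic, links), Arrays.asList(dropEmpty)));
    }
}
```

In this sketch each job only needs to emit fields for the documents it knows about, so a new data source can be added as one more entry in `jobOutputs` without changing the aggregation step.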
Attachments
Issue Links
- depends upon NUTCH-635 LinkAnalysis Tool for Nutch (Closed)