Description
Some use cases require Nutch to write the raw content to a configured indexing back-end. Because Content is never read, a plugin is out of the question, so we need to force IndexJob to process Content as well.
Attachments
Issue Links
- duplicates
  - NUTCH-2032 Plugin to index the raw content of a readable document. (Closed)
- is duplicated by
  - NUTCH-2032 Plugin to index the raw content of a readable document. (Closed)
- relates to
  - NUTCH-2254 Charset issues when using -addBinaryContent and -base64 options (Closed)
  - NUTCH-1458 Support for raw HTML field added to Solr (Closed)
  - NUTCH-1944 Add raw content to indexes (Closed)