Description
Some use cases require Nutch to write the raw content to a configured indexing back-end. Because Content is never read, a plugin is out of the question, so we need to force IndexJob to process Content as well.
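To illustrate what writing raw content to an index document could look like, here is a minimal, self-contained sketch. It does not use Nutch classes: the class `RawContentSketch`, the method `toIndexDoc`, and the field name `binaryContent` are assumptions made for illustration only. The Base64 switch mirrors the `-base64` option named in the related issue NUTCH-2254, but this is not the actual IndexJob code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

public class RawContentSketch {

    // Assumed field name for the raw content; chosen for illustration.
    static final String RAW_FIELD = "binaryContent";

    /**
     * Builds a flat document map holding the URL and the raw fetched bytes.
     * When base64 is true the bytes are Base64-encoded, which keeps arbitrary
     * binary data safe for text-only index fields. Otherwise the bytes are
     * decoded as UTF-8, which is only appropriate for textual content.
     */
    static Map<String, String> toIndexDoc(String url, byte[] raw, boolean base64) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("url", url);
        String value = base64
            ? Base64.getEncoder().encodeToString(raw)
            : new String(raw, StandardCharsets.UTF_8);
        doc.put(RAW_FIELD, value);
        return doc;
    }

    public static void main(String[] args) {
        byte[] raw = "<html><body>hi</body></html>".getBytes(StandardCharsets.UTF_8);
        // Print the document as it would be handed to an index writer.
        System.out.println(toIndexDoc("http://example.org/", raw, true));
    }
}
```

Decoding raw bytes with a fixed charset can corrupt content that is not UTF-8, which is one reason a Base64 mode is useful when the back-end field is text-only.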
Attachments
Issue Links
- duplicates
  - NUTCH-2032 Plugin to index the raw content of a readable document. (Closed)
- is duplicated by
  - NUTCH-2032 Plugin to index the raw content of a readable document. (Closed)
- relates to
  - NUTCH-2254 Charset issues when using -addBinaryContent and -base64 options (Closed)
  - NUTCH-1944 Add raw content to indexes (Resolved)
  - NUTCH-1458 Support for raw HTML field added to Solr (Closed)