Description
At the moment, the “content” field holds only the parsed text from the page. It would be useful to have a separate field in the Solr document that holds the raw HTML of the crawled page.
Issue Links
- is related to: NUTCH-1785 Ability to index raw content (Closed)