Solr / SOLR-1605

ExtractingRequestHandler does not embed original document

    Details

      Description

      The ExtractingRequestHandler has no option to embed the original document file as a stored field.

      This would be generally useful for content management systems: the search index could serve the content directly, which makes for a much simpler system architecture.

      My use case is highlighting indexed HTML. Because the raw HTML text is not indexed, there is no way to request it with highlighting applied.
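
      Until such an option exists, one possible client-side workaround is to send the raw source along with the extract request as a literal field. The sketch below only builds the request URL. It assumes a Solr instance at http://localhost:8983/solr and a hypothetical field named raw_html that the schema declares as indexed and stored. The literal.<fieldname> parameter and the /update/extract path are the handler's documented ones; everything else is illustrative. Passing the source in the query string is only practical for small documents.

```python
from urllib.parse import urlencode


def build_extract_params(doc_id, raw_html, raw_field="raw_html"):
    """Build parameters for an extract request that also carries the raw source.

    raw_field is a hypothetical field name; it must exist in the schema
    as an indexed, stored field for highlighting to work on it.
    """
    return {
        "literal.id": doc_id,               # unique key for the document
        f"literal.{raw_field}": raw_html,   # raw source copied into its own field
        "commit": "true",
    }


def build_extract_url(base_url, params):
    """Join the (assumed) Solr base URL, the extract path, and the encoded parameters."""
    return f"{base_url.rstrip('/')}/update/extract?{urlencode(params)}"


params = build_extract_params("doc1", "<p>Hello <b>Solr</b></p>")
url = build_extract_url("http://localhost:8983/solr", params)
print(url)
```

      The document file itself would still be posted as the request body, so the handler indexes the extracted text as usual while the literal field keeps the original markup.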

        Activity

        Jan Høydahl added a comment:

        I'm not sure this is a great idea. You could add an option to store the source as a BinaryField or similar, but what good does it do to have a 500 MB media file in your index? Or do you want to keep the parsed, structured XHTML output from Tika in a stored field? I'm afraid that output is not meant for pretty display.

          People

          • Assignee:
            Unassigned
            Reporter:
            Lance Norskog
          • Votes:
            0
            Watchers:
            2
