Solr / SOLR-1397

It should be possible to highlight external text

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: highlighter
    • Labels:
      None

      Description

      Many sites don't store text in Lucene/Solr, so they need a way to highlight text that is stored in a database (or some equivalent).

      As a workaround, FieldAnalysisRequestHandler can provide offsets for external text, but it does not support wildcard queries.
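To make the workaround concrete, here is a minimal sketch of building such a request. It assumes the handler is registered at `/analysis/field` (as in the stock example configuration) and that a field type named by the caller exists in the schema; the class and method names are hypothetical, and `URLEncoder.encode(String, Charset)` requires Java 10 or later. The response should list the tokens of the supplied text with their start and end offsets, which the client then uses to do its own highlighting.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds a FieldAnalysisRequestHandler URL for external text.
final class FieldAnalysisUrl {

    // solrBase is an assumption, e.g. "http://localhost:8983/solr".
    static String build(String solrBase, String fieldType, String externalText, String query) {
        return solrBase + "/analysis/field"
                + "?analysis.fieldtype=" + enc(fieldType)      // analyzer to apply
                + "&analysis.fieldvalue=" + enc(externalText)  // text fetched from the database
                + "&analysis.query=" + enc(query)              // user query to match against
                + "&analysis.showmatch=true";                  // flag tokens that match the query
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }
}
```

Because the query is only run through the analyzer and compared token by token, a wildcard such as `bro*` is not expanded, which is the limitation noted above.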

      1. TestExternalFieldProvider.java (0.6 kB, Jamie Johnson)
      2. SolrHighlighter.java (5 kB, Jamie Johnson)
      3. SolrExternalFieldProvider.java (0.4 kB, Jamie Johnson)
      4. ExternalHighlighter.patch (11 kB, Jamie Johnson)
      5. DefaultSolrHighlighter.java (28 kB, Jamie Johnson)

          Activity

          Jamie Johnson added a comment -

          Attached is a first pass at a patch adding the External Highlighter. I have not had a chance to write tests for it yet; it's just meant to be a starting point. There were some changes to the DefaultHighlighter, so my changes didn't apply cleanly out of the box, but hopefully I've caught everything.

          To add an external provider, just add this to the highlighter configuration.

          solrconfig.xml
          
              <highlighting>
                <externalFieldProvider default="true" name="text" class="somecustomfieldprovider">
                  <str name="param1">value</str>
                </externalFieldProvider>
              </highlighting>
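For readers wondering what a class named in that `class` attribute might look like, here is a purely illustrative sketch. The interface in the attached patch is not shown in this thread, so `ExternalFieldProvider`, its `init` and `getFieldValues` methods, and the map-backed implementation are all hypothetical names; a real provider would presumably query a database instead of an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical contract; NOT the interface from the attached patch.
interface ExternalFieldProvider {
    // Receives the <str name="...">value</str> entries from the configuration.
    void init(Map<String, String> params);

    // Returns the external text for one field of one document.
    String[] getFieldValues(String docKey, String fieldName);
}

// Hypothetical in-memory implementation standing in for a database lookup.
final class MapBackedFieldProvider implements ExternalFieldProvider {
    private final Map<String, String> store = new HashMap<>();
    private String param1;

    @Override
    public void init(Map<String, String> params) {
        param1 = params.get("param1");
    }

    String getParam1() {
        return param1;
    }

    void put(String docKey, String fieldName, String text) {
        store.put(docKey + "/" + fieldName, text);
    }

    @Override
    public String[] getFieldValues(String docKey, String fieldName) {
        String text = store.get(docKey + "/" + fieldName);
        // An empty array signals that there is nothing to highlight.
        return text == null ? new String[0] : new String[] { text };
    }
}
```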
          
          Jamie Johnson added a comment -

          David, I looked at SOLR-1954, but after applying the patch to trunk, the offsets that are returned span the full length of the field plus the highlight tags. Any idea why?

          Jamie Johnson added a comment -

          Modified classes to support External Fields.

          The external field provider supplied with the test class is very simple and always returns the same value. This was fine for my test, since my test data always had the same value.

          David Smiley added a comment -

          This is related to SOLR-1954, which is my patch to expose highlighting offsets in the highlighting component. It was used on a system that did on-the-fly PDF highlighting. It's rather complicated to explain, but all we really needed from Solr was the offsets, which it already knew but didn't expose in the response.
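As an illustration of why offsets alone can be enough, here is a minimal sketch of client-side highlighting once the start and end offsets are known. It is not code from SOLR-1954; the class name, the `int[]{start, end}` representation (end exclusive), and the tag strings are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: wraps [start, end) character spans of external text in tags.
final class OffsetHighlighter {

    static String highlight(String text, List<int[]> offsets, String pre, String post) {
        // Work on a sorted copy so the caller's list is left untouched.
        List<int[]> sorted = new ArrayList<>(offsets);
        sorted.sort(Comparator.comparingInt((int[] o) -> o[0]));

        StringBuilder out = new StringBuilder();
        int cursor = 0; // first character not yet copied to the output
        for (int[] o : sorted) {
            int start = Math.max(o[0], cursor);       // clip spans that overlap an earlier one
            int end = Math.min(o[1], text.length());  // clip spans that run past the text
            if (start >= end) {
                continue; // nothing left of this span after clipping
            }
            out.append(text, cursor, start)
               .append(pre)
               .append(text, start, end)
               .append(post);
            cursor = end;
        }
        out.append(text, cursor, text.length());
        return out.toString();
    }
}
```

The offsets must refer to the same text the client holds; if the indexed text and the external copy differ, the spans will land in the wrong place.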

          Mike Sokolov added a comment -

          I'm interested, but don't see the attachment?

          You can make a patch with svn, or in an IDE equipped with svn. From the top-level project folder run:

          svn diff > SOLR-1397.patch

          This assumes that the only changes in your project are related to the patch.

          Jamie Johnson added a comment -

          I had a need for this as well and have put together an implementation that works for my use case. I've attached my implementation to this JIRA. I didn't know how to create a patch, but if someone has those details, I'll do so.

          Ryan McKinley added a comment -

          There has been interest for a LONG time... but it will take some non-trivial work to make it happen.

          Patches are always welcome!

          Michael Goddard added a comment -

          Agreed! I've just encountered a situation which begs for the capability to access the Solr instance that provided my results, which include the primary key for the RDBMS table and the score. I would pull the text from the RDBMS table and, hopefully, do an HTTP GET to the Solr highlight service with the query and this text, and Solr would return the text, highlighted.

          Are any other folks interested? Has there been any work done along these lines?


            People

            • Assignee:
              Unassigned
              Reporter:
              Anders Melchiorsen
            • Votes:
              14
              Watchers:
              17
