SOLR-925

Highlighter doesn't work on a field which is multiValued="true" and termOffsets="true"

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 1.4
    • Component/s: highlighter
    • Labels:
      None

      Description

      This seems to have been introduced in r674677.

      java.lang.StringIndexOutOfBoundsException: String index out of range: 15
      at java.lang.String.substring(Unknown Source)
      at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:239)
      at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:310)
      at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:83)
      at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:171)
      at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
      at org.apache.solr.core.SolrCore.execute(SolrCore.java:1313)
      at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
      at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
      at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
      at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
      at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
      at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
      at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
      at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
      at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
      at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
      at org.mortbay.jetty.Server.handle(Server.java:285)
      at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
      at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
      at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
      at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
      at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
      at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
      at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
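The trace ends in `String.substring`, which is what one would expect if an offset measured across all values of a multiValued field were applied to the text of a single value. The following is a minimal sketch of that failure mode, not code from Solr or Lucene. The class and method names are hypothetical, and it assumes, as a simplification, that the values are laid end to end with no gap between them.

```java
public class MultiValuedOffsetDemo {

    /**
     * Start offset of a term inside values[index], measured from the start of
     * values[0]. Simplifying assumption: values are concatenated with no gap.
     */
    public static int globalStart(String[] values, int index, String term) {
        int base = 0;
        for (int i = 0; i < index; i++) {
            base += values[i].length();
        }
        return base + values[index].toLowerCase().indexOf(term.toLowerCase());
    }

    /** Cuts [start, end) out of one value, the way a highlighter slices text. */
    public static String fragment(String value, int start, int end) {
        return value.substring(start, end);
    }

    public static void main(String[] args) {
        String[] values = {
            "Description of the features, mentioning various things",
            "Features also is multivalued"
        };
        String term = "features";

        int start = globalStart(values, 1, term);
        int end = start + term.length();
        System.out.println("offsets across all values: " + start + "-" + end);

        try {
            // The offsets refer to the concatenated text, but only the
            // second value is passed in, so the range lies past its end.
            fragment(values[1], start, end);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("out of range for a value of length "
                    + values[1].length());
        }
    }
}
```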

      1. SOLR-925.patch
        2 kB
        Koji Sekiguchi

        Activity

        Steffen Baumgart added a comment -

        Can be replicated with the example-setup that comes with Solr (Jetty + example docs):
        http://localhost:8983/solr/select/?q=iPOd%20video&version=2.2&start=0&rows=10&indent=on&hl=true&qt=dismax

        Mark Miller added a comment -

        I haven't looked into this, so I don't know for sure, but there have always been issues with offsets and multi-valued fields in Lucene (which have affected the Highlighter, of course). I'm hoping that LUCENE-1448 is going to straighten all of that out (so hopefully this too?).

        Koji Sekiguchi added a comment -

        Thanks Mark for the input, but this bug was introduced when fixing SOLR-556.
        I think the attached patch fixes the problem.
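The thread does not describe what the attached patch changes, so the following is not the patch. It is only a sketch of one general way to reconcile field-wide offsets with per-value highlighting: keep the spans that fall inside one value and shift them so they are relative to that value. All names are hypothetical, and it assumes the spans are sorted, non-overlapping, and measured across values concatenated with no gap.

```java
import java.util.ArrayList;
import java.util.List;

public class PerValueOffsets {

    /**
     * Keeps the {start, end} spans that lie inside the value starting at
     * valueBase and rewrites them relative to that value.
     */
    public static List<int[]> rebase(List<int[]> globalSpans,
                                     int valueBase, int valueLength) {
        List<int[]> local = new ArrayList<>();
        for (int[] span : globalSpans) {
            boolean inside = span[0] >= valueBase
                    && span[1] <= valueBase + valueLength;
            if (inside) {
                local.add(new int[] {span[0] - valueBase, span[1] - valueBase});
            }
        }
        return local;
    }

    /** Wraps each value-relative span in <em> tags. */
    public static String highlight(String value, List<int[]> localSpans) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (int[] span : localSpans) {
            out.append(value, pos, span[0])
               .append("<em>")
               .append(value, span[0], span[1])
               .append("</em>");
            pos = span[1];
        }
        return out.append(value, pos, value.length()).toString();
    }

    public static void main(String[] args) {
        String[] values = {
            "Description of the features, mentioning various things",
            "Features also is multivalued"
        };
        // Offsets of "features" measured across both values.
        List<int[]> spans = new ArrayList<>();
        spans.add(new int[] {19, 27});
        spans.add(new int[] {54, 62});

        int base = 0;
        for (String value : values) {
            System.out.println(
                    highlight(value, rebase(spans, base, value.length())));
            base += value.length();
        }
    }
}
```

With the two sample values used later in this thread, the second value comes out as `<em>Features</em> also is multivalued` instead of triggering an out-of-range slice.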

        Koji Sekiguchi added a comment -

        All tests pass. I plan to commit in a few days. Meanwhile I'll look into test cases for this bug.

        David Bowen added a comment -

        -1 on this patch. If you try this example (using the example as in ant run-example):

        <add><doc>
        <field name="id">Test for Highlighting StringIndexOutOfBoundsExcdption</field>
        <field name="name">Some Name</field>
        <field name="manu">Acme, Inc.</field>
        <field name="features">Description of the features, mentioning various things</field>
        <field name="features">Features also is multivalued</field>
        <field name="popularity">6</field>
        <field name="inStock">true</field>
        </doc></add>

        then this URL

        http://localhost:8983/solr/select/?q=features&hl=true&hl.snippets=2&hl.fl=features&hl.fragsize=0

        shows correct highlighting in the first snippet, but highlighting is shifted right about 19 characters in the second.

        Koji Sekiguchi added a comment -

        David, what do the snippets you are getting look like?
        I used your sample doc and URL, and I got:

        <lst name="highlighting">
          <lst name="Test for Highlighting StringIndexOutOfBoundsExcdption">
            <arr name="features">
              <str>Description of the &lt;em&gt;features&lt;/em&gt;, mentioning various things</str>
              <str>&lt;em&gt;Features&lt;/em&gt; also is multivalued</str>
            </arr>
          </lst>
        </lst>
        
        David Bowen added a comment -

        That's strange. I get this:

        <lst name="highlighting">
          <lst name="Test for Highlighting StringIndexOutOfBoundsExcdption">
            <arr name="features">
              <str>Description of the <em>features</em>, mentioning various things</str>
              <str>Features also is mu<em>ltivalue</em>d</str>
            </arr>
          </lst>
        </lst>
        
        David Bowen added a comment -

        Ack! Sorry Koji, my mistake. I was inadvertently using a patched Lucene highlighter class which apparently causes this problem.

        +1 on committing this patch.

        Koji Sekiguchi added a comment -

        Committed revision 729450 w/ some test cases.

        Grant Ingersoll added a comment -

        Bulk close for Solr 1.4


          People

          • Assignee:
            Koji Sekiguchi
            Reporter:
            Koji Sekiguchi
          • Votes:
            0
            Watchers:
            0
