Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-6139

TokenGroup.getStart|EndOffset should return matchStart|EndOffset not start|endOffset

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Patch Available
    • Major
    • Resolution: Fixed
    • None
    • 5.0, 6.0
    • modules/highlighter
    • None
    • New

    Description

      The default highlighter has a TokenGroup class that is passed to Formatter.highlightTerm(). TokenGroup also has getStartOffset() and getEndOffset() methods that ostensibly return the start and end offsets into the original text of the current term. These getters aren't called by Lucene or Solr but they are made available and are useful to me. The problem is that they return the wrong offsets when there are tokens at the same position. I believe this was an oversight of LUCENE-627 in which these getters should have been updated but weren't. The fix is simple: return matchStartOffset and matchEndOffset from these getters, not startOffset and endOffset. I think this oversight would not have occurred if Highlighter didn't have package-access to TokenGroup's fields.

      Attachments

        Issue Links

          Activity

            People

              dsmiley David Smiley
              dsmiley David Smiley
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: