Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-6139

TokenGroup.getStart|EndOffset should return matchStart|EndOffset not start|endOffset

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0, 6.0
    • Component/s: modules/highlighter
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The default highlighter has a TokenGroup class that is passed to Formatter.highlightTerm(). TokenGroup also has getStartOffset() and getEndOffset() methods that ostensibly return the start and end offsets into the original text of the current term. These getters aren't called by Lucene or Solr but they are made available and are useful to me. The problem is that they return the wrong offsets when there are tokens at the same position. I believe this was an oversight of LUCENE-627 in which these getters should have been updated but weren't. The fix is simple: return matchStartOffset and matchEndOffset from these getters, not startOffset and endOffset. I think this oversight would not have occurred if Highlighter didn't have package-access to TokenGroup's fields.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                dsmiley David Smiley
                Reporter:
                dsmiley David Smiley
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: