
Key: LUCENE-1285
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Otis Gospodnetic
Reporter: Andrzej Bialecki
Votes: 0
Watchers: 0
Lucene - Java

WeightedSpanTermExtractor incorrectly treats the same terms occurring in different query types

Created: 15/May/08 01:01 PM   Updated: 11/Oct/08 12:49 PM
Component/s: contrib/highlighter
Affects Version/s: 2.4
Fix Version/s: 2.4

Time Tracking:
Not Specified

File Attachments:
highlighter-test.patch (1 kB), 2008-05-15 01:51 PM, Mark Miller, licensed for inclusion in ASF works
highlighter.patch (3 kB), 2008-05-15 01:14 PM, Andrzej Bialecki, licensed for inclusion in ASF works

Lucene Fields: New, Patch Available
Resolution Date: 27/May/08 04:10 PM


Description
Given a BooleanQuery with multiple clauses, if a term occurs both in a Span / Phrase query and in a TermQuery, the results of term extraction are unpredictable and depend on the order of the clauses. Consequently, the results of highlighting are incorrect.

Example text: t1 t2 t3 t4 t2
Example query: t2 t3 "t1 t2"
Current highlighting: [t1 t2] [t3] t4 t2
Correct highlighting: [t1 t2] [t3] t4 [t2]

The problem comes from the fact that we keep a Map<termText, WeightedSpanTerm>. If a term occurs in a Phrase or Span query, the resulting WeightedSpanTerm will have positionSensitive=true, whereas terms added from a TermQuery have positionSensitive=false. Because the map holds only one entry per term text, the later put() overwrites the earlier one, so the end result for this particular term depends on the order in which the clauses are processed.
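
To make the order dependence concrete, here is a minimal sketch that does not use the Lucene classes. The class OrderDependenceSketch and its nested Term class are hypothetical stand-ins for the extractor and WeightedSpanTerm; the only assumption is that terms are stored in a plain map keyed by term text, as described above.

```java
import java.util.HashMap;
import java.util.Map;

public class OrderDependenceSketch {

    /** Hypothetical stand-in for WeightedSpanTerm: only the flag matters here. */
    static final class Term {
        final String text;
        final boolean positionSensitive;

        Term(String text, boolean positionSensitive) {
            this.text = text;
            this.positionSensitive = positionSensitive;
        }
    }

    /**
     * Puts the term "t2" into a plain HashMap once per clause, in clause order,
     * and returns the flag that survives. With a plain map the last put() wins.
     */
    static boolean finalFlag(boolean... flagsInClauseOrder) {
        Map<String, Term> terms = new HashMap<>();
        for (boolean flag : flagsInClauseOrder) {
            terms.put("t2", new Term("t2", flag));
        }
        return terms.get("t2").positionSensitive;
    }

    public static void main(String[] args) {
        // TermQuery (false) processed first, then the phrase (true).
        System.out.println("term, then phrase: " + finalFlag(false, true));
        // Same clauses, opposite order.
        System.out.println("phrase, then term: " + finalFlag(true, false));
    }
}
```

With the example query above, the TermQuery clause for t2 comes before the phrase clause, so the position-sensitive entry wins and the trailing t2 is left unhighlighted.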

My fix is to use a subclass of Map which, on put(), always sets the result to the most lax setting. That is, if we already have a term with positionSensitive=true and we try to put() the same term with positionSensitive=false (or the other way around), the stored term ends up with positionSensitive=false, as it will match both cases.
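
The idea can be sketched roughly as follows. This is an illustration under assumptions, not the attached patch: LaxTermMapSketch, SpanTerm and PositionLaxMap are hypothetical names, and SpanTerm stands in for WeightedSpanTerm with only the positionSensitive flag modelled.

```java
import java.util.HashMap;

public class LaxTermMapSketch {

    /** Hypothetical stand-in for WeightedSpanTerm. */
    static final class SpanTerm {
        final String text;
        boolean positionSensitive;

        SpanTerm(String text, boolean positionSensitive) {
            this.text = text;
            this.positionSensitive = positionSensitive;
        }
    }

    /** Map that keeps the most lax positionSensitive setting seen for each term. */
    static final class PositionLaxMap extends HashMap<String, SpanTerm> {
        private static final long serialVersionUID = 1L;

        @Override
        public SpanTerm put(String key, SpanTerm value) {
            SpanTerm prev = super.put(key, value);
            // If an earlier entry was not position sensitive, the new entry
            // must not be either. If the new entry is already lax, it simply
            // replaces the stricter one.
            if (prev != null && !prev.positionSensitive) {
                value.positionSensitive = false;
            }
            return prev;
        }
        // Caveat: bulk operations such as putAll() may bypass this override,
        // so this sketch only covers single put() calls.
    }

    public static void main(String[] args) {
        PositionLaxMap termFirst = new PositionLaxMap();
        termFirst.put("t2", new SpanTerm("t2", false)); // from the TermQuery
        termFirst.put("t2", new SpanTerm("t2", true));  // from the phrase

        PositionLaxMap phraseFirst = new PositionLaxMap();
        phraseFirst.put("t2", new SpanTerm("t2", true));
        phraseFirst.put("t2", new SpanTerm("t2", false));

        // Both orders end up with the same, lax setting.
        System.out.println(termFirst.get("t2").positionSensitive);
        System.out.println(phraseFirst.get("t2").positionSensitive);
    }
}
```

A term that only ever appears in Phrase or Span queries keeps positionSensitive=true, since no lax entry is ever put for it.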


