[LUCENE-2880] SpanQuery scoring inconsistencies - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 5.3
Component/s: None
Labels: None

Lucene Fields: New

Description

Spinoff of LUCENE-2879.

You can see a full description there, but the gist is that SpanQuery sums up freqs with "sloppyFreq".
However, the slop it passes in is simply spans.end() - spans.start().

For a SpanTermQuery, for example, this means it's scoring 0.5 for TF versus TermQuery's 1.0 (a single term spans one position, so the slop is 1).
As you can imagine, I think in practical situations this would make it difficult for SpanQuery users to
really use SpanQueries for effective ranking, especially in combination with non-span queries (maybe via DisjunctionMaxQuery, etc.).
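
To make the arithmetic concrete, here is a minimal, self-contained sketch that does not use Lucene itself. It assumes the default similarity's sloppyFreq is 1 / (distance + 1) and that a span's end position is exclusive; the class and helper names (SloppyFreqSketch, spanMatchFreq, termMatchFreq) are hypothetical and only model the behavior described above.

```java
public class SloppyFreqSketch {

    // Assumption: models the default similarity's sloppyFreq, 1 / (distance + 1).
    public static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    // Freq contribution of one span match as described in this issue:
    // the "slop" handed to sloppyFreq is just the span width, end - start.
    public static float spanMatchFreq(int start, int end) {
        return sloppyFreq(end - start);
    }

    // Hypothetical helper: a single term occurrence counts as a raw freq of 1.
    public static float termMatchFreq() {
        return 1.0f;
    }

    public static void main(String[] args) {
        int position = 7; // arbitrary term position; end is exclusive, so end = position + 1
        System.out.println("SpanTermQuery freq: " + spanMatchFreq(position, position + 1));
        System.out.println("TermQuery freq:     " + termMatchFreq());
    }
}
```

Under these assumptions the same single occurrence contributes half as much through the span path as through the plain term path, which is the mismatch reported here.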

The problem is more general than this simple example: for example SpanNearQuery should be consistent with PhraseQuery's slop.
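
The same mismatch can be sketched for phrases. The example below is an illustration under stated assumptions, not Lucene code: it assumes sloppyFreq is 1 / (distance + 1), that an exact PhraseQuery match has a match distance of 0, and that a span over adjacent terms has an exclusive end, so its width equals the number of terms. All names (PhraseVsSpanNearSketch, exactPhraseFreq, spanNearFreq) are hypothetical.

```java
public class PhraseVsSpanNearSketch {

    // Assumption: models the default similarity's sloppyFreq, 1 / (distance + 1).
    public static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    // Assumption: an exact phrase match has match distance 0.
    public static float exactPhraseFreq() {
        return sloppyFreq(0);
    }

    // A span over termCount adjacent terms starting at 'start' ends (exclusively)
    // at start + termCount, so the width passed to sloppyFreq is termCount.
    public static float spanNearFreq(int start, int termCount) {
        int end = start + termCount;
        return sloppyFreq(end - start);
    }

    public static void main(String[] args) {
        // Two adjacent terms matched exactly, e.g. at positions 4 and 5.
        System.out.println("PhraseQuery freq:   " + exactPhraseFreq());
        System.out.println("SpanNearQuery freq: " + spanNearFreq(4, 2));
    }
}
```

With these assumptions an exact two-term match contributes a freq of 1 as a phrase but only 1/3 as a span, and the gap grows with the number of terms even though nothing is out of place.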

Attachments

- LUCENE-2880.patch (13 kB), 23/Jan/11 15:10, Robert Muir
- LUCENE-2880.patch (10 kB), 17/Jun/15 06:52, Adrien Grand

Issue Links

is related to: LUCENE-533 SpanQuery scoring: SpanWeight lacks a recursive traversal of the query tree (Closed)

People

Assignee: Unassigned
Reporter: Robert Muir
Votes: 1
Watchers: 7

Dates

Created: 23/Jan/11 15:08
Updated: 28/Aug/22 12:39
Resolved: 18/Jun/15 20:51