[JCR-3478] Partial search terms matching fails when there is a lot of matching content outside the query's scope - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.4.4, 2.5.3
Component/s: jackrabbit-core
Labels:
None

Description

This continues the work from JCR-3428.

It appears that for a full-text search such as 'ipsu*', the WildcardQueryRewrite generates the list of matching tokens to use as the query condition from all of the matching tokens found in the index, not just the ones that fall within the query's scope.

This list is then used in the excerpt generation with a 'must all match' condition, which breaks the excerpts.

For example if we have the following content:
/
/testNode1 with the property 'text'='lorem ipsum'
/testNode2 with the property 'foo'='ipsuFoo'
/testNode3 with the property 'bar'='ipsuBar'

and the query testNode1//*[jcr:contains(., 'ipsu*')]/rep:excerpt(.)

What will happen is that the WildcardQueryRewrite will extract 3 terms for the highlighter: ipsum, ipsuFoo and ipsuBar, which will be passed as a single list of terms, basically a 'must all match' condition.
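The failure mode can be sketched in plain Java. This is only an illustration under simplifying assumptions, not Jackrabbit's actual highlighter code: the class `AllMatchSketch` and its methods are hypothetical, and tokenization is reduced to lower-casing and splitting on whitespace.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AllMatchSketch {

    // Hypothetical helper: lower-case the text and split it on whitespace.
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.toLowerCase().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // 'Must all match': every term has to occur in the text.
    static boolean allMatch(List<String> terms, String text) {
        return tokenize(text).containsAll(terms);
    }

    public static void main(String[] args) {
        // Terms collected from the whole index, including out-of-scope nodes.
        List<String> rewritten = Arrays.asList("ipsum", "ipsufoo", "ipsubar");

        // testNode1 only contains 'ipsum', so the combined condition fails.
        System.out.println(allMatch(rewritten, "lorem ipsum")); // false

        // With only the in-scope term, the same text matches.
        System.out.println(allMatch(Arrays.asList("ipsum"), "lorem ipsum")); // true
    }
}
```

In this sketch the text of testNode1 never satisfies the three-term condition, which mirrors why no excerpt highlighting is produced.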

What I want to do is break this list into a list of 3 sets each containing a single term, turning it into a 'match any' type of condition.

The interesting part here is that, in order to preserve the existing functionality for the Japanese language as well (where a word can be composed of several tokens that are passed around via a PhraseQuery), I'm going to explicitly check for PhraseQuery tokens and transform them into a 'must all match' list of tokens.
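The proposed restructuring might look roughly like the following. This is a minimal sketch, not the committed patch: `TermGroupingSketch`, `group` and `matches` are hypothetical names, terms are plain strings rather than Lucene objects, and `tok1`/`tok2` are placeholders for the tokens of a multi-token phrase. The outer list is read as 'match any', and each inner set as 'must all match'.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TermGroupingSketch {

    // Build the 'match any' list: one single-term set per wildcard expansion,
    // and one multi-term set per phrase (all tokens of a phrase must match).
    static List<Set<String>> group(List<String> wildcardTerms,
                                   List<List<String>> phrases) {
        List<Set<String>> groups = new ArrayList<>();
        for (String term : wildcardTerms) {
            groups.add(Collections.singleton(term));
        }
        for (List<String> phrase : phrases) {
            groups.add(new LinkedHashSet<>(phrase));
        }
        return groups;
    }

    // The text matches if at least one group is fully contained in its tokens.
    static boolean matches(List<Set<String>> groups, Set<String> textTokens) {
        for (Set<String> g : groups) {
            if (textTokens.containsAll(g)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Set<String>> groups = group(
                Arrays.asList("ipsum", "ipsufoo", "ipsubar"),
                Collections.<List<String>>emptyList());
        Set<String> text = new HashSet<>(Arrays.asList("lorem", "ipsum"));
        System.out.println(groups.size());         // 3
        System.out.println(matches(groups, text)); // true

        // A phrase keeps its tokens together as one 'must all match' set.
        List<Set<String>> phraseGroups = group(
                Collections.<String>emptyList(),
                Collections.singletonList(Arrays.asList("tok1", "tok2")));
        System.out.println(matches(phraseGroups,
                new HashSet<>(Arrays.asList("tok1"))));         // false
        System.out.println(matches(phraseGroups,
                new HashSet<>(Arrays.asList("tok1", "tok2")))); // true
    }
}
```

Keeping phrases as a single set is what would preserve the multi-token behaviour, while wildcard expansions no longer block each other.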

Issue Links

supercedes

JCR-3428 Partial search terms are no longer highlighted in the excerpts

Closed

People

Assignee: Alex Deparvu

Reporter: Alex Deparvu

Votes: 0

Watchers: 1

Dates

Created: 07/Dec/12 09:39

Updated: 06/May/13 11:20

Resolved: 07/Dec/12 09:43