[OAK-4816] Property index: cost estimate with path restriction is too optimistic - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.5.12, 1.6.0
Component/s: query
Labels: None

Description

The property index cost estimate is too optimistic when a query combines a property restriction with a path restriction. The current algorithm, as documented in http://jackrabbit.apache.org/oak/docs/query/property-index.html#Cost_Estimation, assumes that matching entries are evenly distributed over the whole repository. Often this assumption does not hold. In the extreme case, all entries that match the property restriction are in the subtree that matches the path restriction. Example:

10'000 nodes with property color "red".
1 million nodes in the repository
10'000 nodes in the subtree /content
query /jcr:root/content//*[@color = 'red']

Currently, the cost estimate is about 100: there are about 10'000 entries for "red", and "/content" contains 1% of all nodes, so the estimate is 1% of 10'000. In reality, there might be 10'000 entries with color "red" in that subtree (that is, all of them).
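
The figure of about 100 follows directly from the even-distribution assumption. A minimal sketch of that arithmetic in Python is shown here; the function and parameter names are hypothetical and chosen for illustration, and this is not Oak's actual code:

```python
def uniform_estimate(matching_entries, subtree_nodes, total_nodes):
    """Expected matches in the subtree if entries were spread evenly
    over the whole repository (hypothetical helper, not an Oak API)."""
    return matching_entries * subtree_nodes / total_nodes


# Numbers from the example: 10'000 "red" entries,
# 10'000 nodes under /content, 1 million nodes in total.
print(uniform_estimate(10_000, 10_000, 1_000_000))  # 100.0
```

The result is two orders of magnitude below the worst case of 10'000 matches in the subtree.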

The cost estimate should take this into account and assume that at least 80% of the matching nodes are in that subtree (provided the subtree contains that many nodes). For the example above, the estimate would then be at least 8'000 instead of about 100.
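
One possible reading of this proposal, sketched in Python: keep the even-distribution estimate, but never let it drop below 80% of the matching entries, capped at the number of nodes in the subtree. The names and the exact way the two values are combined are assumptions made for illustration; the change committed to Oak may differ.

```python
def estimate_with_floor(matching_entries, subtree_nodes, total_nodes,
                        min_fraction=0.8):
    """Even-distribution estimate with a pessimistic lower bound.

    Hypothetical sketch: min_fraction=0.8 reflects the "at least 80%"
    assumption; the bound is capped by the subtree size, because a
    subtree cannot hold more matches than it has nodes.
    """
    uniform = matching_entries * subtree_nodes / total_nodes
    floor = min(min_fraction * matching_entries, float(subtree_nodes))
    return max(uniform, floor)


# Example from the description: the subtree is large enough for the floor.
print(estimate_with_floor(10_000, 10_000, 1_000_000))  # 8000.0

# A small subtree of 500 nodes caps the estimate at its own size.
print(estimate_with_floor(10_000, 500, 1_000_000))  # 500.0
```

Taking the maximum keeps the estimate unchanged when the path restriction covers most of the repository, where the even-distribution value is already the larger one.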

People

Assignee: Thomas Mueller

Reporter: Thomas Mueller

Votes: 0

Watchers: 2

Dates

Created: 16/Sep/16 12:14

Updated: 27/Apr/17 09:52

Resolved: 05/Oct/16 14:12