Jackrabbit Oak / OAK-4009

Search done via Lucene index might return duplicate results


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.3.0
    • Fix Version/s: 1.2.14, 1.3.16, 1.4
    • Component/s: lucene
    • Labels: None

    Description

      In certain cases (such as the one described in OAK-2569), the Lucene index version can be updated in the middle of a cursor traversal. The offset in use is then no longer valid, and LucenePropertyIndex currently re-runs the query without an offset, which can return duplicate rows.

      At that time we relied on seenPaths to detect duplicate paths. However, a closer look at the code flow shows that it is used only for queries that involve path transformation [1], not in all cases.

      Earlier we did use a unique cursor, but that was changed in OAK-2754 in favour of seenPaths. As a fix, we should revert that change and use a unique cursor there.

      [1] For example, a search on jcr:content/@title, where the actual search is done on 'title' and the path is checked to end with jcr:content.
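
      The effect of a unique cursor can be illustrated with a minimal, self-contained sketch. It is not the Oak implementation: the class UniquePathsSketch, the UniqueIterator wrapper, and the simulated row list are all hypothetical names introduced here for illustration. The sketch assumes that a traversal restarted without an offset replays paths that were already returned, and shows how a wrapper that remembers returned paths filters them out for every query, not only for those with path transformation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Oak.
class UniquePathsSketch {

    /** Wraps an iterator of (non-null) paths and skips any path already returned. */
    static final class UniqueIterator implements Iterator<String> {
        private final Iterator<String> delegate;
        private final Set<String> seen = new HashSet<>();
        private String next; // look-ahead element, null when not yet fetched

        UniqueIterator(Iterator<String> delegate) {
            this.delegate = delegate;
        }

        @Override
        public boolean hasNext() {
            // Advance until a path that has not been returned before is found.
            while (next == null && delegate.hasNext()) {
                String candidate = delegate.next();
                if (seen.add(candidate)) {
                    next = candidate;
                }
            }
            return next != null;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            String result = next;
            next = null;
            return result;
        }
    }

    /**
     * Simulates the failure mode: two rows are read, the index version
     * changes, and the query is re-run from the start without an offset.
     */
    static List<String> simulateRestartedTraversal() {
        List<String> rows = new ArrayList<>();
        rows.add("/a");
        rows.add("/b"); // end of the first batch
        rows.add("/a"); // re-run starts from the beginning again
        rows.add("/b");
        rows.add("/c");
        return rows;
    }

    /** Drains the rows through the unique wrapper, preserving first-seen order. */
    static List<String> unique(List<String> rows) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = new UniqueIterator(rows.iterator());
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(unique(simulateRestartedTraversal())); // prints [/a, /b, /c]
    }
}
```

      The trade-off in this sketch is memory: the set of returned paths grows with the size of the result, which is the cost of guaranteeing uniqueness regardless of how the underlying traversal restarts.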


          People

            Assignee: Chetan Mehrotra (chetanm)
            Reporter: Chetan Mehrotra (chetanm)
