Lucene - Core
  1. Lucene - Core
  2. LUCENE-5375

ToChildBlockJoinQuery becomes crazy on wrong subquery

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 4.6
    • Fix Version/s: 4.6.1, 6.0
    • Component/s: modules/join
    • Labels:
    • Lucene Fields:
      New

      Description

      If user supplies wrong subquery to ToParentBlockJoinQuery it reasonably throws IllegalStateException. (http://lucene.apache.org/core/4_0_0/join/org/apache/lucene/search/join/ToParentBlockJoinQuery.html 'The child documents must be orthogonal to the parent documents: the wrapped child query must never return a parent document.'). However ToChildBlockJoinQuery just goes crazy silently. I want to provide simple patch for ToChildBlockJoinQuery with if-throw clause and test.

      See http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201311.mbox/%3CF415CE3A-EBE5-4D15-ADF1-C5EAD32A1EB2@sheffield.ac.uk%3E

      1. LUCENE-5375.patch
        9 kB
        Dr Oleg Savrasov
      2. SOLR-5553.patch
        8 kB
        Dr Oleg Savrasov
      3. SOLR-5553-1.patch
        13 kB
        Dr Oleg Savrasov
      4. SOLR-5553-insufficient_assertions.patch
        9 kB
        Dr Oleg Savrasov

        Activity

        Hide
        Dr Oleg Savrasov added a comment -

        Proposed patch is provided, see attach

        Show
        Dr Oleg Savrasov added a comment - Proposed patch is provided, see attach
        Hide
        Michael McCandless added a comment -

        This seems like a lucene issue; I'll rename it, and we should fix the tests to be Lucene-level tests...

        I agree it'd be useful to detect mis-use but I don't like the added cost this imposed on correct usage. I wonder if there's a more best-effort approach we could take.

        If you enable assertions, without adding the validateParents, do the tests trip the existing assertions?

        Show
        Michael McCandless added a comment - This seems like a lucene issue; I'll rename it, and we should fix the tests to be Lucene-level tests... I agree it'd be useful to detect mis-use but I don't like the added cost this imposed on correct usage. I wonder if there's a more best-effort approach we could take. If you enable assertions, without adding the validateParents, do the tests trip the existing assertions?
        Hide
        Dr Oleg Savrasov added a comment -

        Hi Michael,

        Many thanks for reviewing the patch. I agree that it's rather Lucene issue and should be covered by appropriate tests.

        I see your point about adding cost for correct usage. If I enable assertions without adding the validateParents, testAdvanceValidationForToChildBjq always fail, which means that there could be another way for query validation. Let me investigate it.

        Thank you,

        Dr Oleg Savrasov,
        Community Coordinator,
        Grid Dynamics Search team

        Show
        Dr Oleg Savrasov added a comment - Hi Michael, Many thanks for reviewing the patch. I agree that it's rather Lucene issue and should be covered by appropriate tests. I see your point about adding cost for correct usage. If I enable assertions without adding the validateParents, testAdvanceValidationForToChildBjq always fail, which means that there could be another way for query validation. Let me investigate it. Thank you, Dr Oleg Savrasov, Community Coordinator, Grid Dynamics Search team
        Hide
        Dr Oleg Savrasov added a comment -

        After some investigations I came to conclusion that existing assertions in ToChildBlockJoinScorer are not sufficient to guarantee child and parent documents orthogonality. In order to prove that, I've created a test which expects appropriate AssertionError, please see attached SOLR-5553-insufficient_assertions.patch. The test fails for example with -Dtests.seed=E8A0C61499EE8851:C5E7CB6721742C4F.

        I have not found any way for correct validation other than checking parentBits. It costs retrieving appropriate bit from FixedBitSet, but it seems not too expensive.

        The test is reworked in order to be Lucene-level and cover the cases when existing assertions are not sufficient. Please see attached SOLR-5553-1.patch.

        Show
        Dr Oleg Savrasov added a comment - After some investigations I came to conclusion that existing assertions in ToChildBlockJoinScorer are not sufficient to guarantee child and parent documents orthogonality. In order to prove that, I've created a test which expects appropriate AssertionError, please see attached SOLR-5553 -insufficient_assertions.patch. The test fails for example with -Dtests.seed=E8A0C61499EE8851:C5E7CB6721742C4F. I have not found any way for correct validation other than checking parentBits. It costs retrieving appropriate bit from FixedBitSet, but it seems not too expensive. The test is reworked in order to be Lucene-level and cover the cases when existing assertions are not sufficient. Please see attached SOLR-5553 -1.patch.
        Hide
        Dr Oleg Savrasov added a comment -

        Test that shows insufficient assertions in ToChildBlockJoinScorer. Expects AssertionError and fails randomly, for example with -Dtests.seed=E8A0C61499EE8851:C5E7CB6721742C4F

        Show
        Dr Oleg Savrasov added a comment - Test that shows insufficient assertions in ToChildBlockJoinScorer. Expects AssertionError and fails randomly, for example with -Dtests.seed=E8A0C61499EE8851:C5E7CB6721742C4F
        Hide
        Dr Oleg Savrasov added a comment -

        Reworked patch

        Show
        Dr Oleg Savrasov added a comment - Reworked patch
        Hide
        Dr Oleg Savrasov added a comment -

        Hi Michael,

        Have you had a chance to look at it? Is there any update?

        Thank you,
        Oleg

        Show
        Dr Oleg Savrasov added a comment - Hi Michael, Have you had a chance to look at it? Is there any update? Thank you, Oleg
        Hide
        ASF subversion and git services added a comment -

        Commit 1558334 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1558334 ]

        LUCENE-5375: ToChildBlockJoinQuery works harder to detect mis-use, where the parent query incorrectly returns child docs

        Show
        ASF subversion and git services added a comment - Commit 1558334 from Michael McCandless in branch 'dev/trunk' [ https://svn.apache.org/r1558334 ] LUCENE-5375 : ToChildBlockJoinQuery works harder to detect mis-use, where the parent query incorrectly returns child docs
        Hide
        ASF subversion and git services added a comment -

        Commit 1558335 from Michael McCandless in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1558335 ]

        LUCENE-5375: ToChildBlockJoinQuery works harder to detect mis-use, where the parent query incorrectly returns child docs

        Show
        ASF subversion and git services added a comment - Commit 1558335 from Michael McCandless in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1558335 ] LUCENE-5375 : ToChildBlockJoinQuery works harder to detect mis-use, where the parent query incorrectly returns child docs
        Hide
        ASF subversion and git services added a comment -

        Commit 1558336 from Michael McCandless in branch 'dev/branches/lucene_solr_4_6'
        [ https://svn.apache.org/r1558336 ]

        LUCENE-5375: ToChildBlockJoinQuery works harder to detect mis-use, where the parent query incorrectly returns child docs

        Show
        ASF subversion and git services added a comment - Commit 1558336 from Michael McCandless in branch 'dev/branches/lucene_solr_4_6' [ https://svn.apache.org/r1558336 ] LUCENE-5375 : ToChildBlockJoinQuery works harder to detect mis-use, where the parent query incorrectly returns child docs
        Hide
        Michael McCandless added a comment -

        Woop, sorry Oleg: this had fallen past the event horizon on my TODO list!

        Thank you, I just committed the last patch.

        Show
        Michael McCandless added a comment - Woop, sorry Oleg: this had fallen past the event horizon on my TODO list! Thank you, I just committed the last patch.
        Hide
        ASF subversion and git services added a comment -

        Commit 1558350 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1558350 ]

        LUCENE-5375: fix javadocs

        Show
        ASF subversion and git services added a comment - Commit 1558350 from Michael McCandless in branch 'dev/trunk' [ https://svn.apache.org/r1558350 ] LUCENE-5375 : fix javadocs
        Hide
        ASF subversion and git services added a comment -

        Commit 1558351 from Michael McCandless in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1558351 ]

        LUCENE-5375: fix javadocs

        Show
        ASF subversion and git services added a comment - Commit 1558351 from Michael McCandless in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1558351 ] LUCENE-5375 : fix javadocs
        Hide
        ASF subversion and git services added a comment -

        Commit 1558352 from Michael McCandless in branch 'dev/branches/lucene_solr_4_6'
        [ https://svn.apache.org/r1558352 ]

        LUCENE-5375: fix javadocs

        Show
        ASF subversion and git services added a comment - Commit 1558352 from Michael McCandless in branch 'dev/branches/lucene_solr_4_6' [ https://svn.apache.org/r1558352 ] LUCENE-5375 : fix javadocs
        Hide
        Dr Oleg Savrasov added a comment -

        Many thanks!

        Show
        Dr Oleg Savrasov added a comment - Many thanks!
        Hide
        Mikhail Khludnev added a comment -
        Caused by: java.lang.IllegalStateException: child query must only match non-parent docs, but parent docID=21474 matched childScorer=class org.apache.lucene.search.TermScorer
        

        I'd like to raise an issue to improve the message. I suppose we need to put children and parent filter queries into text to let users check the document in the intersection.

        Show
        Mikhail Khludnev added a comment - Caused by: java.lang.IllegalStateException: child query must only match non-parent docs, but parent docID=21474 matched childScorer=class org.apache.lucene.search.TermScorer I'd like to raise an issue to improve the message. I suppose we need to put children and parent filter queries into text to let users check the document in the intersection.
        Hide
        Dr Oleg Savrasov added a comment -

        Validation message improvement

        Show
        Dr Oleg Savrasov added a comment - Validation message improvement

          People

          • Assignee:
            Unassigned
            Reporter:
            Dr Oleg Savrasov
          • Votes:
            2 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              Original Estimate - 24h
              24h
              Remaining:
              Remaining Estimate - 24h
              24h
              Logged:
              Time Spent - Not Specified
              Not Specified

                Development