Lucene - Core
LUCENE-5520

ArrayIndexOutOfBoundsException in ToChildBlockJoinQuery when there's a deleted parent without any children

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.2, 4.7
    • Fix Version/s: 4.7.1, 4.8, 6.0
    • Component/s: modules/join
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      This problem was found in Lucene 4.2.0 and reproduced in 4.7.0.

      In our app, when we delete a document we always delete all of its children.
      But not all parents have children. The exception happens for us when a parent without children is deleted.

      1. LUCENE-5220.patch
        3 kB
        Michael McCandless
      2. non working patch.patch
        0.7 kB
        Sally Ang
      3. TestBlockJoin.patch
        2 kB
        Sally Ang
      4. testout.txt
        591 kB
        Sally Ang
      5. working patch.patch
        0.6 kB
        Sally Ang

        Activity

        Sally Ang added a comment -

        I've attached a patch for TestBlockJoin.java in the file TestBlockJoin.patch.
        Without any other modification, it reproduces the error that occurred in our application.

        I've also attached a patch for ToChildBlockJoinQuery.java, 'working patch.patch', that passes all the original tests as well as my additional test.
        The patch just adds another check so that if the parent doesn't have any children, we continue to the next parent:
        if (acceptDocs != null && !acceptDocs.get(childDoc)) {
          if (childDoc < parentDoc) {
            continue nextChildDoc;
          } else {
            continue;
          }
        }

        What confuses me is the other patch, 'non working patch.patch': it passes all the tests, including mine, except testRandom.
        Here, before checking for a deleted childDoc, I check whether the parent is deleted; if it is, we just go to the next parent:
        if (acceptDocs != null) {
          System.out.println("parent doc " + parentDoc + " is alive: " + acceptDocs.get(parentDoc));
          if (!acceptDocs.get(parentDoc)) {
            continue;
          }
        }

        childDoc = 1 + parentBits.prevSetBit(parentDoc-1);

        if (acceptDocs != null && !acceptDocs.get(childDoc)) {
          continue nextChildDoc;
        }

        Attached in 'testout.txt' is the output of testRandom.
        These lines show that we have 6 deleted parents:
        [junit4:junit4] 1> DELETE parentID=40
        [junit4:junit4] 1> DELETE parentID=49
        [junit4:junit4] 1> DELETE parentID=54
        [junit4:junit4] 1> DELETE parentID=77
        [junit4:junit4] 1> DELETE parentID=102
        [junit4:junit4] 1> DELETE parentID=104

        But somehow more parent docs than that have their acceptDocs bit set to false:
        [junit4:junit4] 1> parent doc 29 is alive: false
        [junit4:junit4] 1> parent doc 40 is alive: false
        [junit4:junit4] 1> parent doc 56 is alive: false
        [junit4:junit4] 1> parent doc 95 is alive: false
        [junit4:junit4] 1> parent doc 99 is alive: false
        [junit4:junit4] 1> parent doc 122 is alive: false
        [junit4:junit4] 1> parent doc 141 is alive: false
        [junit4:junit4] 1> parent doc 150 is alive: false
        [junit4:junit4] 1> parent doc 2 is alive: false
        [junit4:junit4] 1> parent doc 20 is alive: false
        [junit4:junit4] 1> parent doc 38 is alive: false
        [junit4:junit4] 1> parent doc 87 is alive: false
        [junit4:junit4] 1> parent doc 43 is alive: false
        [junit4:junit4] 1> parent doc 59 is alive: false
        [junit4:junit4] 1> parent doc 78 is alive: false
        [junit4:junit4] 1> parent doc 82 is alive: false

        Am I right to assume that acceptDocs.get(docId) returns true if the docId is not deleted and false if it is deleted?
        Can we use the first (working) patch?

        Michael McCandless added a comment -

        Unfortunately, we are not allowed to check acceptDocs with parent docIDs: that bitset is only "valid" for child documents. This is because the primary search is against children, and IndexSearcher could pass a Filter "down low" as the acceptDocs.

        This also means that your app really must delete all child documents for a given parent, if you never want to see that parent; but really it's best to delete parent + all children, whenever you want to delete.

        I have an idea for a possible fix ... I'll test and post a patch.
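
To illustrate the point about acceptDocs, here is a minimal standalone sketch. It uses java.util.BitSet as a stand-in for Lucene's bit sets, and it assumes (as an illustration only, not as a description of IndexSearcher internals) that acceptDocs is the intersection of the live docs with a filter that matches only child documents. The class name, method name, and docID layout are hypothetical.

```java
import java.util.BitSet;

/** Standalone model; java.util.BitSet stands in for Lucene's bit sets. */
final class AcceptDocsSketch {

  /**
   * Assumption for illustration: acceptDocs = liveDocs AND filterBits.
   * Returns a new BitSet and leaves both inputs unchanged.
   */
  static BitSet acceptDocs(BitSet liveDocs, BitSet filterBits) {
    BitSet accept = (BitSet) liveDocs.clone();
    accept.and(filterBits);
    return accept;
  }

  public static void main(String[] args) {
    // Hypothetical block layout: docs 0,1 = children, 2 = parent, 3 = child, 4 = parent.
    BitSet liveDocs = new BitSet();
    liveDocs.set(0, 5); // nothing has been deleted

    // A filter applied to the child-level search matches only child docs 0 and 3.
    BitSet childFilter = new BitSet();
    childFilter.set(0);
    childFilter.set(3);

    BitSet accept = acceptDocs(liveDocs, childFilter);

    // Parent 2 is live, yet its acceptDocs bit is clear because the filter never matches parents.
    System.out.println("parent doc 2 is alive: " + accept.get(2));
  }
}
```

Under that assumption, a clear acceptDocs bit at a parent docID says nothing about whether the parent was deleted, which is consistent with testRandom reporting more "not alive" parents than were actually deleted.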

        Michael McCandless added a comment -

        Sally, could you try this patch?

        I added a check, when we try to jump to the first child for a given doc, to detect the case when that parent has 0 child docs, and then continue in the parent loop if so.
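
The check described above can be sketched roughly as follows. This is a standalone model built on java.util.BitSet, not the committed patch and not Lucene's scorer code: the class, the method name collectChildren, and the docID layout are hypothetical. It relies only on the layout used in this thread, where a parent's children occupy the docIDs immediately before it, so the first child is 1 + (previous parent's docID).

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Standalone model of the parent-to-child walk; not Lucene code. */
final class ChildlessParentSketch {

  /**
   * Returns the accepted child docIDs of every parent in parentBits.
   * acceptDocs may be null (accept everything) and is only consulted for child docIDs.
   */
  static List<Integer> collectChildren(BitSet parentBits, BitSet acceptDocs) {
    List<Integer> hits = new ArrayList<>();
    for (int parentDoc = parentBits.nextSetBit(0);
         parentDoc >= 0;
         parentDoc = parentBits.nextSetBit(parentDoc + 1)) {
      // The first child sits right after the previous parent (or at 0 for the first block).
      // previousSetBit returns -1 when there is no earlier parent.
      int firstChild = 1 + parentBits.previousSetBit(parentDoc - 1);
      if (firstChild == parentDoc) {
        // Parent has 0 children: firstChild is really the parent's own docID,
        // so skip to the next parent before consulting acceptDocs.
        continue;
      }
      for (int childDoc = firstChild; childDoc < parentDoc; childDoc++) {
        if (acceptDocs == null || acceptDocs.get(childDoc)) {
          hits.add(childDoc);
        }
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    // Hypothetical layout: 0,1 = children, 2 = parent, 3 = parent without children,
    // 4 = child, 5 = parent.
    BitSet parentBits = new BitSet();
    parentBits.set(2);
    parentBits.set(3);
    parentBits.set(5);

    BitSet acceptDocs = new BitSet();
    acceptDocs.set(0, 6);
    acceptDocs.clear(3); // the childless parent has been deleted

    System.out.println(collectChildren(parentBits, acceptDocs)); // prints [0, 1, 4]
  }
}
```

In this flat loop the inner loop would be empty for a childless parent anyway; the explicit test is kept to mirror the described fix, since a stateful scorer that positions itself on childDoc would otherwise treat the parent's docID as a child.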

        Sally Ang added a comment -

        I've tried the patch. It works for us.

        Michael McCandless added a comment -

        Thanks Sally, I'll commit this fix.

        ASF subversion and git services added a comment -

        Commit 1577076 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1577076 ]

        LUCENE-5520: fix AIOOBE from ToChildBlockJoinQuery when a parent has no children

        ASF subversion and git services added a comment -

        Commit 1577078 from Michael McCandless in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1577078 ]

        LUCENE-5520: fix AIOOBE from ToChildBlockJoinQuery when a parent has no children

        Michael McCandless added a comment -

        Thanks Sally!

        Robert Muir added a comment -

        reopening for 4.7.1 backport

        ASF subversion and git services added a comment -

        Commit 1578525 from Robert Muir in branch 'dev/branches/lucene_solr_4_7'
        [ https://svn.apache.org/r1578525 ]

        LUCENE-5520: fix AIOOBE from ToChildBlockJoinQuery when a parent has no children

        Steve Rowe added a comment -

        Bulk close 4.7.1 issues


          People

          • Assignee:
            Michael McCandless
            Reporter:
            Sally Ang
          • Votes:
            0
            Watchers:
            4
