Lucene - Core / LUCENE-4076

When doing nested (index-time) joins, ToParentBlockJoinCollector delivers incomplete information on the grand-children

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.4, 3.5, 3.6, 4.7.1
    • Fix Version/s: None
    • Component/s: modules/join
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      ToParentBlockJoinCollector.getTopGroups does not return the correct groups when the query contains nested ToParentBlockJoinQuery instances.

      Given the following example query:

      // Innermost level: match grandchildren and join them up to their child documents.
      Query grandChildQuery = new TermQuery(new Term("color", "red"));
      Filter childFilter = new CachingWrapperFilter(new RawTermFilter(new Term("type", "child")), DeletesMode.IGNORE);
      ToParentBlockJoinQuery grandchildJoinQuery = new ToParentBlockJoinQuery(grandChildQuery, childFilter, ScoreMode.Max);

      // Middle level: child documents must be round and have a matching grandchild.
      BooleanQuery childQuery = new BooleanQuery();
      childQuery.add(grandchildJoinQuery, Occur.MUST);
      childQuery.add(new TermQuery(new Term("shape", "round")), Occur.MUST);

      // Join the matching children up to their parent documents.
      Filter parentFilter = new CachingWrapperFilter(new RawTermFilter(new Term("type", "parent")), DeletesMode.IGNORE);
      ToParentBlockJoinQuery childJoinQuery = new ToParentBlockJoinQuery(childQuery, parentFilter, ScoreMode.Max);

      // Outermost level: parent documents must be named "test" and have a matching child.
      BooleanQuery parentQuery = new BooleanQuery();
      parentQuery.add(childJoinQuery, Occur.MUST);
      parentQuery.add(new TermQuery(new Term("name", "test")), Occur.MUST);

      ToParentBlockJoinCollector parentCollector = new ToParentBlockJoinCollector(Sort.RELEVANCE, 30, true, true);
      searcher.search(parentQuery, null, parentCollector);
      

      This produces the correct results:

      TopGroups<Integer> childGroups = parentCollector.getTopGroups(childJoinQuery, null, 0, 20, 0, false); 
      

      However, this does not:

      TopGroups<Integer> grandChildGroups = parentCollector.getTopGroups(grandchildJoinQuery, null, 0, 20, 0, false); 
      

      The content of grandChildGroups is broken in the following ways:

      • The groupValue is not the document id of the child document (which is the parent of a grandchild document), but the document id of the previous matching parent document
      • There are only as many GroupDocs as there are parent documents (not child documents), and each contains only the children of the last child document (but, as mentioned before, with the wrong groupValue).
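
      To make the expected result concrete, here is a small stand-alone sketch of the grouping that getTopGroups(grandchildJoinQuery, ...) should arguably produce. It does not use Lucene at all: the class name, the expectedGroups helper and the 'G'/'C'/'P' type markers are hypothetical, and it assumes that every grandchild matches the query and that documents are indexed in block order (grandchildren before their child, children before their parent).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a nested block index; this is NOT Lucene code.
final class ExpectedGrandChildGroups {

    /**
     * Maps each child doc id to the ids of the grandchild docs that directly
     * precede it. docTypes[i] is 'G' (grandchild), 'C' (child) or 'P' (parent).
     */
    static Map<Integer, List<Integer>> expectedGroups(char[] docTypes) {
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        List<Integer> pending = new ArrayList<>();
        for (int docId = 0; docId < docTypes.length; docId++) {
            switch (docTypes[docId]) {
                case 'G':
                    // Grandchildren accumulate until their child doc appears.
                    pending.add(docId);
                    break;
                case 'C':
                    // The child doc id is the group value for its grandchildren.
                    if (!pending.isEmpty()) {
                        groups.put(docId, pending);
                    }
                    pending = new ArrayList<>();
                    break;
                default:
                    // A parent doc closes the block; nothing carries over.
                    pending = new ArrayList<>();
                    break;
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Two parent blocks: docs 0-5 (two children) and docs 6-8 (one child).
        char[] docTypes = {'G', 'G', 'C', 'G', 'C', 'P', 'G', 'C', 'P'};
        System.out.println(expectedGroups(docTypes));
        // prints {2=[0, 1], 4=[3], 7=[6]}
    }
}
```

      In this model there are three groups, one per child document (ids 2, 4 and 7), each keyed by the child's own document id. The behaviour described above would instead yield only two groups, one per parent document.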

            People

            • Assignee:
              Unassigned
              Reporter:
              christophk Christoph Kaser