  Lucene - Core / LUCENE-8810

Flattening of nested disjunctions does not take into account the builder's limit on the number of clauses


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 8.0
    • Fix Version/s: 8.2
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New

      Description

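      In org.apache.lucene.search.BooleanQuery, at the end of rewrite(IndexReader reader), the query is rewritten to flatten nested disjunctions: the clauses of an inner pure disjunction are added directly to the builder of the enclosing query.

      This flattening does not take into account the builder's limit on the number of clauses (1024 by default).
      When the inner and outer clauses together exceed that limit, adding them to the builder throws an exception (BooleanQuery.TooManyClauses).

      Here is a unit test that highlights this. The inner query already holds 1024 clauses, so flattening it next to one more clause needs 1025.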

        public void testFlattenInnerDisjunctionsWithMoreThan1024Terms() throws IOException {
          IndexSearcher searcher = newSearcher(new MultiReader());

          // Inner disjunction that exactly fills the default clause limit (1024).
          BooleanQuery.Builder builder1024 = new BooleanQuery.Builder();
          for (int i = 0; i < 1024; i++) {
            builder1024.add(new TermQuery(new Term("foo", "bar-" + i)), Occur.SHOULD);
          }
          Query inner = builder1024.build();

          // Outer disjunction: the inner query plus one extra term (2 clauses).
          Query query = new BooleanQuery.Builder()
              .add(inner, Occur.SHOULD)
              .add(new TermQuery(new Term("foo", "baz")), Occur.SHOULD)
              .build();

          // Flattening would need 1024 + 1 = 1025 clauses, so this rewrite throws.
          searcher.rewrite(query);
        }
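
The failure mode can be reproduced without Lucene using a small stand-alone model. The sketch below is only an illustration: `FlattenSketch`, `Node`, `flattenNaive`, and `flattenGuarded` are hypothetical names, not Lucene classes or methods, and the guard shown (count first, skip flattening if the result would not fit) is one possible approach, not necessarily what the attached patch does. The only detail taken from the report is the limit of 1024.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of flattening nested disjunctions under a clause limit (not Lucene code). */
public class FlattenSketch {
    // Mirrors the limit of 1024 mentioned in the report.
    static final int MAX_CLAUSES = 1024;

    /** Either a term leaf (term != null) or a disjunction (children != null). */
    static final class Node {
        final String term;
        final List<Node> children;
        Node(String term) { this.term = term; this.children = null; }
        Node(List<Node> children) { this.term = null; this.children = children; }
        boolean isDisjunction() { return children != null; }
    }

    /** Adds a clause, refusing to grow past the limit, like a size-limited builder. */
    static void add(List<Node> clauses, Node clause) {
        if (clauses.size() >= MAX_CLAUSES) {
            throw new IllegalStateException("too many clauses: limit is " + MAX_CLAUSES);
        }
        clauses.add(clause);
    }

    /** Naive flattening: inlines every nested disjunction without checking the total. */
    static Node flattenNaive(Node query) {
        List<Node> out = new ArrayList<>();
        for (Node child : query.children) {
            if (child.isDisjunction()) {
                for (Node grandChild : child.children) {
                    add(out, grandChild);
                }
            } else {
                add(out, child);
            }
        }
        return new Node(out);
    }

    /** Guarded flattening: counts first and leaves the query nested if it would not fit. */
    static Node flattenGuarded(Node query) {
        int total = 0;
        for (Node child : query.children) {
            total += child.isDisjunction() ? child.children.size() : 1;
        }
        return total > MAX_CLAUSES ? query : flattenNaive(query);
    }

    /** Same shape as the unit test: 1024 nested terms plus one sibling term. */
    static Node example() {
        List<Node> inner = new ArrayList<>();
        for (int i = 0; i < MAX_CLAUSES; i++) {
            inner.add(new Node("bar-" + i));
        }
        List<Node> outer = new ArrayList<>();
        outer.add(new Node(inner));
        outer.add(new Node("baz"));
        return new Node(outer);
    }

    public static void main(String[] args) {
        Node query = example();
        try {
            flattenNaive(query);
            System.out.println("naive: ok");
        } catch (IllegalStateException e) {
            System.out.println("naive: " + e.getMessage());
        }
        System.out.println("guarded: " + flattenGuarded(query).children.size() + " top-level clauses");
    }
}
```

In this model the naive version fails on the 1025th clause, while the guarded version returns the query with its two original top-level clauses. The trade-off of such a guard is that oversized disjunctions stay nested and so miss whatever benefit flattening was meant to provide.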
      

        Attachments

        1. LUCENE-8810.patch
          5 kB
          Atri Sharma


            People

            • Assignee: Unassigned
            • Reporter: Mickaël Sauvée (msauvee)
            • Votes: 0
            • Watchers: 5


                Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h 20m