  1. Lucene - Core
  2. LUCENE-8810

Flattening of nested disjunctions does not take the builder's clause-count limit into account

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 8.0
    • Fix Version/s: 8.2
    • Component/s: core/search
    • Labels: None
    • Lucene Fields: New

    Description

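      In org.apache.lucene.search.BooleanQuery, the last step of rewrite(IndexReader reader) flattens nested disjunctions by inlining the SHOULD clauses of an inner disjunction into the outer query.

      This flattening does not take into account the limit on the number of clauses in a BooleanQuery.Builder (1024 by default).
      When the combined clause count exceeds that limit, the builder throws a TooManyClauses exception during the rewrite, even though the original query was valid.

      Here is a unit test that highlights the problem: the inner query already holds 1024 clauses, so flattening it together with one extra clause would require 1025.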

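        // Rewriting must not throw, even though flattening would need 1025 clauses.
        public void testFlattenInnerDisjunctionsWithMoreThan1024Terms() throws IOException {
          IndexSearcher searcher = newSearcher(new MultiReader());

          // Inner disjunction that already sits exactly at the default clause limit.
          BooleanQuery.Builder builder1024 = new BooleanQuery.Builder();
          for (int i = 0; i < 1024; i++) {
            builder1024.add(new TermQuery(new Term("foo", "bar-" + i)), Occur.SHOULD);
          }
          Query inner = builder1024.build();
          // Outer disjunction: the inner query plus one more SHOULD clause.
          Query query = new BooleanQuery.Builder()
              .add(inner, Occur.SHOULD)
              .add(new TermQuery(new Term("foo", "baz")), Occur.SHOULD)
              .build();
          // Without a guard on the clause count, this call throws.
          searcher.rewrite(query);
        }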
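
      To illustrate the kind of guard the flattening step needs, here is a minimal, Lucene-free sketch. It is not the attached patch and not Lucene's implementation: FlattenSketch, Node, flattenOnce and the helper methods are hypothetical names, and counting the clauses before inlining them is just one assumed way to avoid exceeding the limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a disjunction tree; none of these types exist in Lucene.
public class FlattenSketch {

    // A node is either a leaf term (children == null) or a disjunction of children.
    public static final class Node {
        public final String term;          // set for leaves only
        public final List<Node> children;  // set for disjunctions only

        Node(String term) { this.term = term; this.children = null; }
        Node(List<Node> children) { this.term = null; this.children = children; }

        public boolean isDisjunction() { return children != null; }
    }

    public static Node term(String text) { return new Node(text); }

    public static Node or(Node... children) {
        return new Node(new ArrayList<>(Arrays.asList(children)));
    }

    // Builds a disjunction of n leaf terms: prefix0, prefix1, ...
    public static Node orOfTerms(String prefix, int n) {
        List<Node> terms = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            terms.add(term(prefix + i));
        }
        return new Node(terms);
    }

    // Inlines directly nested disjunctions into their parent (one level only).
    // Assumed guard: if the flattened query would exceed maxClauses, the
    // original node is returned unchanged instead of failing.
    public static Node flattenOnce(Node node, int maxClauses) {
        if (!node.isDisjunction()) {
            return node;
        }
        long total = 0;
        boolean hasNested = false;
        for (Node child : node.children) {
            if (child.isDisjunction()) {
                hasNested = true;
                total += child.children.size();
            } else {
                total++;
            }
        }
        if (!hasNested || total > maxClauses) {
            return node; // nothing to flatten, or flattening would break the limit
        }
        List<Node> flat = new ArrayList<>((int) total);
        for (Node child : node.children) {
            if (child.isDisjunction()) {
                flat.addAll(child.children);
            } else {
                flat.add(child);
            }
        }
        return new Node(flat);
    }

    public static void main(String[] args) {
        Node small = or(orOfTerms("bar-", 3), term("baz"));
        Node large = or(orOfTerms("bar-", 1024), term("baz"));
        // 3 inner terms + 1 outer term fit within the limit, so they are inlined.
        System.out.println("small: " + flattenOnce(small, 1024).children.size() + " clauses"); // small: 4 clauses
        // 1024 + 1 would exceed the limit, so the nested shape is kept.
        System.out.println("large: " + flattenOnce(large, 1024).children.size() + " clauses"); // large: 2 clauses
    }
}
```

      The second case mirrors the unit test above: the query keeps its nested form (two top-level clauses) instead of failing while being rewritten.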
      

      Attachments

        1. LUCENE-8810.patch
          5 kB
          Atri Sharma

          People

            Assignee: Unassigned
            Reporter: Mickaël Sauvée (msauvee)
            Votes: 0
            Watchers: 6

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h 20m