  1. Lucene - Core
  2. LUCENE-7747

QueryBuilder should build side-paths query (graph query) lazily

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.0, 6.5
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      In LUCENE-7638 we generate a query for each multi-token path in the graph and combine them at the end in a boolean query.
      This can lead to an OOM when the number of paths is large. Instead, we should build the disjunction of these paths lazily so that "too many clauses" is thrown early if the number of paths exceeds the max boolean clause count.
      For instance, a shingle filter with shingles of different sizes produces a graph with multiple side paths at each position. If the input query has many tokens, the number of paths (and therefore of queries) created is exponential. For this use case it may be preferable to disallow graph query analysis completely, but when it is allowed we should still be protected against combinatorial explosion.
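
To illustrate the idea, here is a minimal, self-contained sketch in plain Java. It is not the patch and does not use Lucene classes. It assumes a simplified graph in which every position independently offers a list of alternative tokens, so the path count is the product of the alternatives per position. The names `LazyPaths`, `paths`, `buildDisjunction` and `TooManyClausesException` are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class LazyPaths {

    /** Hypothetical stand-in for a "too many clauses" error; not a Lucene class. */
    static final class TooManyClausesException extends RuntimeException {
        TooManyClausesException(int max) {
            super("too many clauses: max is " + max);
        }
    }

    /**
     * Lazily enumerates every path through a simplified graph (assumption:
     * each position independently offers a list of alternatives). Paths are
     * produced one at a time with an odometer-style counter, so the full,
     * possibly exponential, set of paths is never held in memory.
     */
    static Iterator<List<String>> paths(final List<List<String>> positions) {
        return new Iterator<List<String>>() {
            private final int[] idx = new int[positions.size()];
            private boolean done = positions.isEmpty()
                    || positions.stream().anyMatch(List::isEmpty);

            @Override
            public boolean hasNext() {
                return !done;
            }

            @Override
            public List<String> next() {
                if (done) {
                    throw new NoSuchElementException();
                }
                // Materialize only the current path.
                List<String> path = new ArrayList<>(idx.length);
                for (int p = 0; p < idx.length; p++) {
                    path.add(positions.get(p).get(idx[p]));
                }
                // Advance the odometer: bump the last position, carry leftwards.
                int pos = idx.length - 1;
                while (pos >= 0 && ++idx[pos] == positions.get(pos).size()) {
                    idx[pos] = 0;
                    pos--;
                }
                if (pos < 0) {
                    done = true; // every combination has been produced
                }
                return path;
            }
        };
    }

    /**
     * Consumes the iterator and collects one clause per path, failing as soon
     * as one more path would exceed maxClauses instead of first building all paths.
     */
    static List<List<String>> buildDisjunction(Iterator<List<String>> it, int maxClauses) {
        List<List<String>> clauses = new ArrayList<>();
        while (it.hasNext()) {
            if (clauses.size() >= maxClauses) {
                throw new TooManyClausesException(maxClauses);
            }
            clauses.add(it.next());
        }
        return clauses;
    }
}
```

With 40 positions of two alternatives each there are 2^40 paths, but `buildDisjunction(paths(positions), 1024)` only ever holds 1024 of them before it throws, which is the behaviour the lazy construction is meant to provide.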

        Activity

        jim.ferenczi Jim Ferenczi added a comment - - edited

        Here is a patch that builds a lazy iterator over the different paths. The graph boolean query is built by consuming this iterator and throws a "too many clauses" exception when the number of paths is greater than the max number of clauses allowed.

        mikemccand Michael McCandless added a comment -

        +1, thanks Jim Ferenczi

        jim.ferenczi Jim Ferenczi added a comment -

        Thanks Michael McCandless!

        hossman Hoss Man added a comment -

        typo in the commit messages so they weren't auto picked up by gitbot...
        master: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/3ca4d800
        branch_6x: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/11049ca7
        branch_6_5: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/b462892a

          People

          • Assignee:
            Unassigned
            Reporter:
            jim.ferenczi Jim Ferenczi