Lucene - Core
  1. Lucene - Core
  2. LUCENE-6984

MultiTermQuery mutability can cause assertion failures in BooleanQuery

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.5, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      BooleanQuery caches its hashcode on the grounds that it is immutable. However, this immutability doesn't hold if some of its clauses hold queries that are not themselves immutable - for example, a MultiTermQuery with a changeable rewrite method. If one of these clauses is mutated after the hashcode has been calculated, then the next time the parent BooleanQuery is used the assertion in BooleanQuery.hashCode() will fail.

      1. LUCENE-6984.patch
        5 kB
        Alan Woodward

        Activity

        Hide
        Alan Woodward added a comment -

        I found this through the highlighting code in luwak, where we rewrite queries into their Span equivalents. MultiTermQueries are wrapped in SpanMultiTermQueryWrapper, which alters the inner query's RewriteMethod. If the MTQ is within a deeply-nested BooleanQuery then this invalidates the query's hashcode, and the assertion trips the next time the query is used.

        I can see a few ways to solve this:

        • make MTQ immutable, and rework SpanMTQWrapper so that it uses its inner query's TermsEnum to rewrite itself rather than altering the inner query's rewrite method directly
        • make MTQ cloneable, and use a copy of the query in SpanMTQWrapper rather than altering the query directly
        • stop caching the hashcode on BooleanQuery

        I think we probably need to stop the caching anyway, because there's no way to guarantee that any query passed to BQ is itself immutable, but it's also worth making our core queries immutable if we can.

        Show
        Alan Woodward added a comment - I found this through the highlighting code in luwak, where we rewrite queries into their Span equivalents. MultiTermQueries are wrapped in SpanMultiTermQueryWrapper, which alters the inner query's RewriteMethod. If the MTQ is within a deeply-nested BooleanQuery then this invalidates the query's hashcode, and the assertion trips the next time the query is used. I can see a few ways to solve this: make MTQ immutable, and rework SpanMTQWrapper so that it uses its inner query's TermsEnum to rewrite itself rather than altering the inner query's rewrite method directly make MTQ cloneable, and use a copy of the query in SpanMTQWrapper rather than altering the query directly stop caching the hashcode on BooleanQuery I think we probably need to stop the caching anyway, because there's no way to guarantee that any query passed to BQ is itself immutable, but it's also worth making our core queries immutable if we can.
        Hide
        Adrien Grand added a comment -

        The way I see it, queries should be considered immutable once created, so it is a bug that SpanMTQWrapper modifies the query that is passed in its constructor. Removing the hashcode cache could help fix the bug that you hit but then I am a bit afraid that this would still be an issue for simple search use-cases because of filter caching. Maybe a way to fix it would be for SpanMTQWrapper to have its own MTQ wrapper that allows to override the rewrite method of the query that is provided?

        Show
        Adrien Grand added a comment - The way I see it, queries should be considered immutable once created, so it is a bug that SpanMTQWrapper modifies the query that is passed in its constructor. Removing the hashcode cache could help fix the bug that you hit but then I am a bit afraid that this would still be an issue for simple search use-cases because of filter caching. Maybe a way to fix it would be for SpanMTQWrapper to have its own MTQ wrapper that allows to override the rewrite method of the query that is provided?
        Hide
        Alan Woodward added a comment -

        Here's a patch that changes SpanMTQWrapper to have its own RewriteMethod, and to use that for rewriting rather than delegating down to the wrapped query.

        I'll open a separate issue to make MTQ and SpanMTQWrapper immutable in trunk.

        Show
        Alan Woodward added a comment - Here's a patch that changes SpanMTQWrapper to have its own RewriteMethod, and to use that for rewriting rather than delegating down to the wrapped query. I'll open a separate issue to make MTQ and SpanMTQWrapper immutable in trunk.
        Hide
        Adrien Grand added a comment -

        +1 to the patch, it is much better than my previous suggestion, and +1 to see if we can make (Span)MTQ immutable

        Show
        Adrien Grand added a comment - +1 to the patch, it is much better than my previous suggestion, and +1 to see if we can make (Span)MTQ immutable
        Hide
        ASF subversion and git services added a comment -

        Commit 1725719 from Alan Woodward in branch 'dev/trunk'
        [ https://svn.apache.org/r1725719 ]

        LUCENE-6984: SpanMultiTermQueryWrapper no longer modifies its wrapped query

        Show
        ASF subversion and git services added a comment - Commit 1725719 from Alan Woodward in branch 'dev/trunk' [ https://svn.apache.org/r1725719 ] LUCENE-6984 : SpanMultiTermQueryWrapper no longer modifies its wrapped query
        Hide
        ASF subversion and git services added a comment -

        Commit 1725724 from Alan Woodward in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1725724 ]

        LUCENE-6984: SpanMultiTermQueryWrapper no longer modifies its wrapped query

        Show
        ASF subversion and git services added a comment - Commit 1725724 from Alan Woodward in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1725724 ] LUCENE-6984 : SpanMultiTermQueryWrapper no longer modifies its wrapped query
        Hide
        Alan Woodward added a comment -

        Thanks for the review Adrien. I'll open a new issue to deal with immutability.

        Show
        Alan Woodward added a comment - Thanks for the review Adrien. I'll open a new issue to deal with immutability.
        Hide
        ASF subversion and git services added a comment -

        Commit 1725743 from Alan Woodward in branch 'dev/trunk'
        [ https://svn.apache.org/r1725743 ]

        LUCENE-6984: Remove out-of-date javadoc note

        Show
        ASF subversion and git services added a comment - Commit 1725743 from Alan Woodward in branch 'dev/trunk' [ https://svn.apache.org/r1725743 ] LUCENE-6984 : Remove out-of-date javadoc note
        Hide
        ASF subversion and git services added a comment -

        Commit 1725744 from Alan Woodward in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1725744 ]

        LUCENE-6984: Remove out-of-date javadoc note

        Show
        ASF subversion and git services added a comment - Commit 1725744 from Alan Woodward in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1725744 ] LUCENE-6984 : Remove out-of-date javadoc note

          People

          • Assignee:
            Unassigned
            Reporter:
            Alan Woodward
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development