  1. Lucene - Core
  2. LUCENE-57

MultiSearcher does not work with MultiTermQuery or PrefixQuery

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2
    • Fix Version/s: None
    • Component/s: core/search
    • Labels:
      None
    • Environment:

      Operating System: All
      Platform: Other

    • Bugzilla Id:
      12667

      Description

      The MultiSearcher class reuses Query objects. Unfortunately, some Query subclasses
      initialize the search with terms from the index they are first used on.
      This means that subsequent uses on other indexes can result in missing hits.

      For example, subclasses of MultiTermQuery pre-populate their terms from the
      first index. If the query is used again, the terms are not reset, so the
      query is in effect run over terms extracted from the first index and not the second.

      So the query

      a*b

      is expanded using terms from the first index as

      a1b a2b a3b

      and it is this expanded query that is then run against every index.

      If a second index has the terms

      a1b a2b a3b a4b a5b

      then potential results are missing. The latter two terms are not considered in
      the search.

      This can easily be seen by "reversing" the order of Searcher objects in a
      MultiSearcher query. Different results can be displayed depending on the ordering.

      The solution would be to allow queries to be "reset", thus removing the cached
      term objects.
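
A minimal sketch of the failure described above. It uses hypothetical stand-in classes (`CachingWildcardQuery`, `totalHits`) and plain lists of terms in place of indexes. It does not use the Lucene API and only illustrates why the order of the searchers changes the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CachedExpansionDemo {
    /** Hypothetical stand-in for a wildcard query that caches its expanded terms. */
    static class CachingWildcardQuery {
        private final String prefix;
        private final String suffix;
        private List<String> expanded; // filled on first use and never cleared

        CachingWildcardQuery(String prefix, String suffix) {
            this.prefix = prefix;
            this.suffix = suffix;
        }

        private boolean matches(String term) {
            return term.length() >= prefix.length() + suffix.length()
                    && term.startsWith(prefix) && term.endsWith(suffix);
        }

        /** Returns the terms of this "index" that the cached expansion hits. */
        List<String> search(List<String> indexTerms) {
            if (expanded == null) {
                // The expansion is computed from whichever index is searched first.
                expanded = new ArrayList<>();
                for (String term : indexTerms) {
                    if (matches(term)) {
                        expanded.add(term);
                    }
                }
            }
            List<String> hits = new ArrayList<>();
            for (String term : expanded) {
                if (indexTerms.contains(term)) {
                    hits.add(term);
                }
            }
            return hits;
        }
    }

    static final List<String> FIRST = Arrays.asList("a1b", "a2b", "a3b");
    static final List<String> SECOND = Arrays.asList("a1b", "a2b", "a3b", "a4b", "a5b");

    /** Reuses one query object across all indexes, as the report describes. */
    static int totalHits(List<List<String>> indexes) {
        CachingWildcardQuery query = new CachingWildcardQuery("a", "b");
        int total = 0;
        for (List<String> index : indexes) {
            total += query.search(index).size();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalHits(Arrays.asList(FIRST, SECOND))); // prints 6
        System.out.println(totalHits(Arrays.asList(SECOND, FIRST))); // prints 8
    }
}
```

With the smaller index first, the terms a4b and a5b are never part of the expansion, so the second index contributes only three hits instead of five.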

        Activity

        otis@apache.org Otis Gospodnetic added a comment -

        Doug's commit today indicates that this bug should now be fixed. Marking this
        as fixed before I forget to do it. Feel free to reopen if it is not.

        alt@picnic.demon.co.uk Andy Thomas added a comment -

        Created an attachment (id=3507)
        Sample class that shows the bug

        otis@apache.org Otis Gospodnetic added a comment -

        I'd love to fix this if it is really a bug, but I can't duplicate this behaviour.
        Could you please attach a self-contained class that demonstrates this bug?
        Thanks.

        alt@picnic.demon.co.uk Andy Thomas added a comment -

        It seems that queries are reset using the "prepare()" method on a Query. However,
        MultiTermQuery does not do this correctly. It caches, in the member

        private BooleanQuery query;

        the results of the terms generated from the index. This value is
        never cleared in the prepare() method, hence further indexes are ignored.

        You cannot fix this with a prepare() method in MultiTermQuery, since its
        subclasses override the method and provide their own implementation.

        The solution I propose is to change setEnum in MultiTermQuery as follows:

        /** Set the TermEnum to be used */
        protected void setEnum(FilteredTermEnum enum) {
            this.enum = enum;
            this.query = null;
        }

        This clears the cached query on each reset.

        I have tested this patch and it correctly produces the search results when using
        multiple indexes.
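
A minimal sketch of the idea behind this patch, assuming a hypothetical stand-in class (`MultiTermQuerySketch`) in which a list of terms plays the role of the FilteredTermEnum and a list of matching terms plays the role of the cached BooleanQuery. It is not the Lucene source. The parameter is named `termEnum` because `enum` has been a reserved word since Java 5.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ResetOnSetEnumDemo {
    /** Hypothetical stand-in for MultiTermQuery; not the Lucene class. */
    static class MultiTermQuerySketch {
        private final String prefix;
        private final String suffix;
        private List<String> termEnum; // stands in for the FilteredTermEnum
        private List<String> query;    // stands in for the cached BooleanQuery

        MultiTermQuerySketch(String prefix, String suffix) {
            this.prefix = prefix;
            this.suffix = suffix;
        }

        /** Mirrors the proposed setEnum: a new enum discards the cached expansion. */
        void setEnum(List<String> termEnum) {
            this.termEnum = termEnum;
            this.query = null;
        }

        /** Builds the expansion lazily from the current enum and caches it. */
        List<String> getQuery() {
            if (termEnum == null) {
                throw new IllegalStateException("setEnum must be called first");
            }
            if (query == null) {
                query = new ArrayList<>();
                for (String term : termEnum) {
                    if (term.length() >= prefix.length() + suffix.length()
                            && term.startsWith(prefix) && term.endsWith(suffix)) {
                        query.add(term);
                    }
                }
            }
            return query;
        }
    }

    static final List<String> FIRST = Arrays.asList("a1b", "a2b", "a3b");
    static final List<String> SECOND = Arrays.asList("a1b", "a2b", "a3b", "a4b", "a5b");

    /** Reuses one query object, calling setEnum before each index is searched. */
    static int totalHits(List<List<String>> indexes) {
        MultiTermQuerySketch q = new MultiTermQuerySketch("a", "b");
        int total = 0;
        for (List<String> index : indexes) {
            q.setEnum(index);
            total += q.getQuery().size();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalHits(Arrays.asList(FIRST, SECOND))); // prints 8
        System.out.println(totalHits(Arrays.asList(SECOND, FIRST))); // prints 8
    }
}
```

Because the cached expansion is dropped whenever a new enum is set, the total no longer depends on the order of the indexes in this sketch.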


          People

          • Assignee:
            java-dev@lucene.apache.org Lucene Developers
            Reporter:
            alt@picnic.demon.co.uk Andy Thomas