  1. Lucene - Core
  2. LUCENE-35

[PATCH] AND match fails if any Term is filtered out by an analyser.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core/search
    • Labels:
      None
    • Environment:

      Operating System: other
      Platform: Other

    • Bugzilla Id:
      9110

      Description

      If I do an AND search with a StandardAnalyzer, the word 'it' (or rather IT,
      for information technology) is removed by the analyzer. That is OK: I get no
      result. But when I search for 'it' AND 'plus' ('plus' is not removed by the
      analyzer), the result is empty, too. That is not fine, because if I search
      only for 'plus' I get a result.
      So I think that if a word is thrown away by the analyzer, that part of the AND
      query should have no effect on the rest of the search. It should be left out by
      the BooleanQuery.
      I hope it is easy to fix, because it has a big effect on search results.
      (I tried leaving out the analyzer entirely, but that wasn't suitable ... sorry
      for my bad English.)
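
A minimal sketch of the behaviour requested above, written without any Lucene dependency. The class `AndQuerySketch`, its methods, and the stop-word list are all hypothetical and chosen for illustration; this is not Lucene's API or the actual fix. It only shows the idea that a term dropped by the analyzer is left out of the AND query instead of making the whole query fail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the requested behaviour; not Lucene's API.
final class AndQuerySketch {
    // Assumed stop-word list, for illustration only.
    static final Set<String> STOP_WORDS = Set.of("it", "is", "the", "a", "an");

    // Mimics an analyzer: lower-cases the term and returns null for stop words.
    static String analyze(String term) {
        String t = term.toLowerCase(Locale.ROOT);
        return STOP_WORDS.contains(t) ? null : t;
    }

    // Collects the required terms of "t1 AND t2 AND ...",
    // leaving out every term the analyzer removed.
    static List<String> requiredTerms(String... terms) {
        List<String> required = new ArrayList<>();
        for (String term : terms) {
            String analyzed = analyze(term);
            if (analyzed != null) {
                required.add(analyzed);
            }
        }
        return required;
    }

    // A document matches if it contains every remaining required term.
    // Treating an empty term list as "matches nothing" is an assumption.
    static boolean matches(Set<String> docTerms, List<String> required) {
        return !required.isEmpty() && docTerms.containsAll(required);
    }
}
```

With this sketch, `requiredTerms("it", "plus")` keeps only `plus`, so a document containing `plus` still matches, while a query consisting only of `it` matches nothing.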

        Activity

        mlambert@aseedge.com Michael Lambert added a comment -

        I agree. This is a big showstopper. It also happens if you have a
        double-quoted set of terms (a phrase).

        For example:
        Apache AND Cool
        returns 100 results
        "Apache is cool"
        returns 0 results

        Not good.

        otis@apache.org Otis Gospodnetic added a comment -
        *** Bug 19149 has been marked as a duplicate of this bug. ***
        otis@apache.org Otis Gospodnetic added a comment -
        *** Bug 7088 has been marked as a duplicate of this bug. ***
        daniel.naber@t-online.de Daniel Naber added a comment -

        Created an attachment (id=9278)
        test case that shows the phrase/stopword problem is okay now

        daniel.naber@t-online.de Daniel Naber added a comment -

        I attached a test case which, I think, shows that the "apache is cool" query works okay now
        (tested with 1.3RC2). However, the problem described in #19149 persists. Unless I got
        something wrong, this bug could be closed and #19149 should be re-opened (unlike this bug it
        contains a code fragment which makes the problem easy to check).

        morus.walter@gmx.de Morus Walter added a comment -

        The attached patch fixes the ArrayIndexOutOfBoundsExceptions that occur when
        stop words are present.
        It also checks that a BooleanQuery is not empty, that is, that it contains at
        least one non-prohibited clause. For empty queries (including queries containing
        only prohibited clauses), null is returned.
        Note that this may also drop subqueries, e.g.
        `a (-b -c)' gets parsed as `a', since `-b -c' cannot be searched.
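
The pruning rule described in this comment can be sketched roughly as follows. This is not the attached patch and does not use Lucene classes: `PruneSketch`, `Clause`, and the helper methods are hypothetical names introduced for illustration. The sketch assumes a query is a tree of term clauses and nested boolean groups, and returns null for any group that has no non-prohibited clause left.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical query model for illustration; these are not Lucene classes.
final class PruneSketch {
    // A clause is either a term (children == null) or a nested boolean group.
    static final class Clause {
        final String term;            // set for term clauses, null for groups
        final List<Clause> children;  // set for groups, null for term clauses
        final boolean prohibited;     // true for "-clause"

        Clause(String term, List<Clause> children, boolean prohibited) {
            this.term = term;
            this.children = children;
            this.prohibited = prohibited;
        }
    }

    static Clause term(String t) { return new Clause(t, null, false); }
    static Clause not(String t) { return new Clause(t, null, true); }
    static Clause group(Clause... cs) { return new Clause(null, List.of(cs), false); }

    // Returns the simplified clause, or null if nothing searchable remains:
    // a group is kept only if at least one non-prohibited clause survives.
    static Clause prune(Clause c) {
        if (c.children == null) {
            return c; // term clauses are kept unchanged
        }
        List<Clause> kept = new ArrayList<>();
        boolean hasNonProhibited = false;
        for (Clause child : c.children) {
            Clause pruned = prune(child);
            if (pruned == null) {
                continue; // drop subqueries that cannot be searched
            }
            kept.add(pruned);
            if (!pruned.prohibited) {
                hasNonProhibited = true;
            }
        }
        return hasNonProhibited ? new Clause(null, kept, c.prohibited) : null;
    }
}
```

Under these assumptions, pruning `group(term("a"), group(not("b"), not("c")))` leaves a group containing only `a`, mirroring the `a (-b -c)' example, while `group(not("b"))` and an empty `group()` both prune to null.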

        morus.walter@gmx.de Morus Walter added a comment -

        Created an attachment (id=10052)
        patch to fix ArrayIndexOutOfBoundsExceptions in query parser

        morus.walter@gmx.de Morus Walter added a comment -

        I forgot to say that this patch becomes obsolete if the new version of my
        patch for Bug #25820 is accepted.

        goller@detego-software.de Christoph Goller added a comment -

        I applied the part of Morus' patch that fixes ArrayIndexOutOfBoundsExceptions
        since it does not change QueryParser's behaviour, only fixes a bug if stopwords
        occur at the beginning of a boolean query.

        goller@detego-software.de Christoph Goller added a comment -

        I think this bug can be closed now, since the important aspects are
        better described in bugs #25820 and #7574.

        *** This bug has been marked as a duplicate of 7574 ***

          People

          • Assignee:
            java-dev@lucene.apache.org Lucene Developers
            Reporter:
            m.wagner@blue-orange.de Michael Wagner