  1. Lucene - Core
  2. LUCENE-306

[PATCH] Multiple '?' wildcards at the end of a search pattern return incorrect hits

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: None
    • Component/s: core/search
    • Labels:
      None
    • Environment:

      Operating System: other
      Platform: Other

    • Bugzilla Id:
      32167

      Description

      The problem is that a search on "ca??" returns hits such as 'cat' and
      'CA', while the user only wants four-letter words starting with "ca",
      such as 'card' and 'cash'. This happens only when there are multiple '?'
      wildcards at the end of the search pattern. The solution is to check
      whether the word being matched against the search pattern ends while
      there are still '?' characters left in the pattern. If so, the match
      should return false.

      Attached is the patch I generated using 'diff':
      ********************************************************************

      --- WildcardTermEnum.org 2004-05-11 11:42:10.000000000 -0400
      +++ WildcardTermEnum.java 2004-11-08 14:35:14.823610500 -0500
      @@ -132,6 +132,10 @@
       }
       else
       {
      + //to prevent "cat" matches "ca??"
      + if(wildchar == WILDCARD_CHAR)
      + {
      + return false;
      + }
       // Look at the next character
       wildcardSearchPos++;
       }
      **********************************************************************
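
The intended semantics can be illustrated with a small standalone matcher. This is only a sketch under stated assumptions, not the code in WildcardTermEnum.java: the class name `WildcardSketch`, the method names, and the recursive structure are hypothetical, and `WILDCARD_STRING` is assumed to be the '*' counterpart of the `WILDCARD_CHAR` constant seen in the patch. The point it shows is that a '?' left over after the word has ended must fail the match.

```java
// Hypothetical illustration only; not the actual Lucene implementation.
final class WildcardSketch {
    static final char WILDCARD_STRING = '*'; // assumed: matches zero or more characters
    static final char WILDCARD_CHAR = '?';   // matches exactly one character

    static boolean wildcardEquals(String pattern, String text) {
        return matchFrom(pattern, 0, text, 0);
    }

    private static boolean matchFrom(String pattern, int p, String text, int t) {
        while (p < pattern.length()) {
            char wildchar = pattern.charAt(p);
            if (wildchar == WILDCARD_STRING) {
                // Let '*' absorb every possible number of characters, including none.
                for (int skip = t; skip <= text.length(); skip++) {
                    if (matchFrom(pattern, p + 1, text, skip)) {
                        return true;
                    }
                }
                return false;
            }
            if (t >= text.length()) {
                // The word ended while the pattern still expects a character
                // (a literal or a '?'): this is the case the patch rejects.
                return false;
            }
            if (wildchar != WILDCARD_CHAR && wildchar != text.charAt(t)) {
                return false;
            }
            p++;
            t++;
        }
        // The pattern is used up; the word must be used up as well.
        return t == text.length();
    }

    public static void main(String[] args) {
        System.out.println(wildcardEquals("ca??", "cat"));  // false
        System.out.println(wildcardEquals("ca??", "card")); // true
        System.out.println(wildcardEquals("ca*", "ca"));    // true
    }
}
```

Unlike '?', a trailing '*' may match nothing, which is why "ca*" still accepts "ca" in this sketch.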

        Activity

        xiaozheng.ma@redwood.com Xiaozheng Ma added a comment -

        For the unit test:

        In testQuestionmark of TestWildcardQuery.java, the original assertions
        for query2 and query3 must change for the test to pass, because
        previously 'metal' matched 'metal?', but it should not.

        Change to query2's assertion:
        assertMatches(searcher, query2, 1);
        // note that the number changes to
        // 1 since 'metal' is not a match any more.
        The same modification to query3's assertion:
        assertMatches(searcher, query3, 0);
        // change to 0 since there is no match

        Erik has suggested a new unit test:

        Query query6 = new WildcardQuery(new Term("body", "metal??"));
        assertMatches(searcher, query6, 0);

        After reviewing this bug carefully, I realize that it is not limited to
        multiple '?' characters; it applies to any trailing '?'. After all, "?"
        is not a "*".
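
The expected counts above can be cross-checked without an index by translating each wildcard pattern to a regular expression. This is a rough sketch, not part of the Lucene test suite: the class `WildcardCountSketch`, its methods, and the two-term list are hypothetical stand-ins for the documents the real test indexes, which are not shown in this issue.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical cross-check; it does not use Lucene at all.
final class WildcardCountSketch {

    // '?' becomes '.', '*' becomes '.*', everything else is matched literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // Counts the terms that the whole pattern matches.
    static long countMatches(String wildcard, List<String> terms) {
        Pattern pattern = toRegex(wildcard);
        return terms.stream().filter(t -> pattern.matcher(t).matches()).count();
    }

    public static void main(String[] args) {
        // Assumed term list, chosen only to mirror the cases discussed above.
        List<String> terms = Arrays.asList("metal", "metals");
        System.out.println(countMatches("metal?", terms));  // 1
        System.out.println(countMatches("metals?", terms)); // 0
        System.out.println(countMatches("metal??", terms)); // 0
    }
}
```

With this assumed term list, the counts agree with the revised assertions for query2, query3, and query6.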

        bernhard.messer@intrafind.de Bernhard Messer added a comment -

        WildcardQuery no longer matches 'cat' for queries like 'ca??'.


          People

          • Assignee:
            Unassigned
            Reporter:
            xiaozheng.ma@redwood.com Xiaozheng Ma