Lucene - Core / LUCENE-813

Leading wildcards don't work with trailing wildcard

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.2
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      Patch Available

      Description

      As reported by Antony Bowesman, leading wildcards don't work when there is a trailing wildcard character – instead a PrefixQuery is constructed.

      http://www.nabble.com/QueryParser-bug--tf3270956.html

      1. 813.fix.lead.wildcard.patch
        29 kB
        Doron Cohen
      2. qp-leading-wildcard.patch
        0.9 kB
        Hoss Man

          Activity

          hossman Hoss Man added a comment -

          patch demonstrating problem in testcase

          doronc Doron Cohen added a comment -

          I fixed the PREFIXTERM definition in QueryParser.jj, changing
          (<_TERM_START_CHAR> | "*") (<_TERM_CHAR>)* "*" >
          to
          ("*") | ( <_TERM_START_CHAR> (<_TERM_CHAR>)* "*" ) >
          which I think is more correct, and this solved the problem.
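
          To illustrate why moving the leading "*" into its own alternative matters, here is a minimal sketch that approximates the old and new PREFIXTERM definitions with java.util.regex. It is not the JavaCC grammar itself: the class name, the method names, and the simplified [a-z] character classes standing in for _TERM_START_CHAR and _TERM_CHAR are assumptions made only for this example.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; the real token definitions live in QueryParser.jj.
class PrefixTermSketch {
    // Assumption: [a-z] stands in for the much broader _TERM_START_CHAR and _TERM_CHAR.
    static final String START = "[a-z]";
    static final String CHAR = "[a-z]";

    // Old shape: (START | "*") CHAR* "*"
    static final Pattern OLD_PREFIXTERM =
        Pattern.compile("(" + START + "|\\*)" + CHAR + "*\\*");

    // New shape: "*" | (START CHAR* "*")
    static final Pattern NEW_PREFIXTERM =
        Pattern.compile("\\*|(" + START + CHAR + "*\\*)");

    // True if the whole string matches the old definition.
    static boolean oldMatches(String s) {
        return OLD_PREFIXTERM.matcher(s).matches();
    }

    // True if the whole string matches the new definition.
    static boolean newMatches(String s) {
        return NEW_PREFIXTERM.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String q : new String[] {"abc*", "*abc*"}) {
            System.out.println(q + " old=" + oldMatches(q) + " new=" + newMatches(q));
        }
    }
}
```

          Under these simplifications, "abc*" matches both shapes, while "*abc*" matches only the old one. That is consistent with the reported symptom: with the old definition a term such as "*abc*" could be taken as a whole prefix term, whereas with the new one it no longer qualifies and is left for the wildcard term rule.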

          However, this caused another parsing test (testSimple) to fail, apparently for other reasons: the standard analyzer used in that test lowercases the query tokens, which contain umlauts, and this fails because the lowercasing is done char by char (Character.toLowerCase). I think that the wrong definition of PREFIXTERM was masking this behavior before. I am not sure yet whether this is a bug, but for testSimple to pass for now I would modify the test to use a non-lowercasing analyzer when umlauts are present. Please comment if you think this is a bug.

          Another technical issue that came up is line endings: compiling on XP, using Cygwin, the JavaCC result files had wrong line endings.
          I fixed that with
          perl -p -e 's/(\r\n|\n|\r)/\n/g' QueryParser.java_fromJavacc > QueryParser.java
          Do others have this problem?
          Is there a standard solution for this (other than installing *nix)?
          If not, I may look into fixing this in build.xml.
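
          For readers without Perl at hand, the same normalization can be sketched in Java. This is only an illustration of the idea behind the one-liner above, not the fix that went into the build; the class and method names are hypothetical.

```java
// Hypothetical helper: converts CRLF and lone CR line endings to LF.
class LineEndings {
    static String normalize(String text) {
        // Match \r\n first so a CRLF pair becomes a single \n, then any lone \r.
        return text.replaceAll("\\r\\n|\\r", "\n");
    }

    public static void main(String[] args) {
        String mixed = "line1\r\nline2\rline3\n";
        // Show the result with newlines made visible.
        System.out.print(normalize(mixed).replace("\n", "\\n"));
    }
}
```

          Reading the generated file into a string, passing it through normalize, and writing it back would give consistent LF endings regardless of the platform that ran JavaCC.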

          Will have a patch with the fix later today.

          michaelbusch Michael Busch added a comment -

          > Another technical issue that came up is line endings - compiling on XP, using cygwin, the javacc result files had wrong line endings.

          I have the same problem. I'm compiling on Win XP too, using Eclipse and the JavaCC plugin. That used to work fine for me, but now it doesn't anymore: the result files have inconsistent line endings.

          doronc Doron Cohen added a comment -

          The attached 813.fix.lead.wildcard.patch fixes this by modifying the definition of PREFIXTERM.
          The patch includes the test added by Hoss in the previous patch.
          I also added to QP.jj two javadoc comments that were only in QP.java.
          I added a few wildcard tests.
          All tests pass.

          Doron

          doronc Doron Cohen added a comment -

          Reattaching (forgot to grant the license in the first shot...)

          doronc Doron Cohen added a comment -

          I added a fix for the line-ending problem in build.xml:
          http://issues.apache.org/jira/browse/LUCENE-814
          (only for the QueryParser for now.)
          Could you try it?

          Thanks,
          Doron

          doronc Doron Cohen added a comment -

          Now working for Antony -
          http://www.nabble.com/forum/ViewPost.jtp?post=9094319&framed=y

          I will commit this if there are no other comments.

          Doron

          michaelbusch Michael Busch added a comment -

          I tried it out, Doron. It works fine and all tests pass. I like the new tests in TestWildcard.

          doronc Doron Cohen added a comment -

          Thanks for reviewing this Michael!

          Committed, with a few additional wildcard queries in TestWildcard.testParsingAndSearching().


            People

            • Assignee: doronc Doron Cohen
            • Reporter: hossman Hoss Man
            • Votes: 0
            • Watchers: 1
