  1. Lucene.Net
  2. LUCENENET-596

QueryParser produces a wrong query if KeywordRepeatFilter is used in the analyzer

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: Lucene.Net 4.8.0
    • Fix Version/s: None
    • Labels: None

      Description

      Below is a code sample illustrating how to reproduce the issue:

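          using System.IO;
          using Lucene.Net.Analysis;
          using Lucene.Net.Analysis.Miscellaneous;
          using Lucene.Net.Analysis.Standard;
          using Lucene.Net.Analysis.Util;
          using Lucene.Net.QueryParsers.Classic;
          using Lucene.Net.Util;

          // Parse a required term whose text the analyzer splits into several tokens.
          var query = "+FieldName:Value_0";
          var parser = new QueryParser(LuceneVersion.LUCENE_48, "FieldName", new CustomAnalyzer());
          var res = parser.Parse(query);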
      
          class CustomAnalyzer : Analyzer
          {
              protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
              {
                  var tokenizer = new LetterOrDigitTokenizer(LuceneVersion.LUCENE_48, reader);
                 
                  TokenStream stream = new StandardFilter(LuceneVersion.LUCENE_48, tokenizer);
                
                  stream = new KeywordRepeatFilter(stream);
                 
                  return new TokenStreamComponents(tokenizer, stream);
              }
          }
      
          class LetterOrDigitTokenizer : CharTokenizer
          {
              public LetterOrDigitTokenizer(LuceneVersion matchVersion, TextReader input) : base(matchVersion, input)
              {
              }
      
              protected override bool IsTokenChar(int c)
              {
                  return char.IsLetterOrDigit((char)c);
              }
          }
      

      The resulting query differs between versions 3.0.3 and 4.8:

      Lucene 3.0.3
      +FieldName:"(value value) 0"

      Lucene 4.8 beta 4
      +((FieldName:value FieldName:valu) FieldName:0)

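      As a result, a document whose FieldName contains only "0" (without the word "value") is still found in Lucene 4.8: the query only requires that at least one of the optional clauses inside the outer parentheses matches, whereas the 3.0.3 phrase query requires "value" followed by "0".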
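
      One setting that may be relevant here, although the issue itself does not mention it: since Lucene 3.1 the classic QueryParser no longer builds a phrase query automatically when the analyzer splits a single query term into several tokens. The sketch below assumes the AutoGeneratePhraseQueries property of the Lucene.Net 4.8 classic parser and uses a hypothetical helper named ParseAsPhrase to opt back into phrase generation. Whether this reproduces the 3.0.3 output for the analyzer above has not been verified.

      ```csharp
      using Lucene.Net.Analysis;
      using Lucene.Net.QueryParsers.Classic;
      using Lucene.Net.Search;
      using Lucene.Net.Util;

      static class PhraseQueryExample
      {
          // Hypothetical helper (not part of Lucene.Net): parses queryText with
          // automatic phrase generation enabled, so that a term the analyzer
          // splits into several tokens is treated as a phrase rather than as
          // a group of optional clauses.
          public static Query ParseAsPhrase(string field, string queryText, Analyzer analyzer)
          {
              var parser = new QueryParser(LuceneVersion.LUCENE_48, field, analyzer)
              {
                  // Assumption: this is the setting that distinguishes the two outputs.
                  AutoGeneratePhraseQueries = true
              };
              return parser.Parse(queryText);
          }
      }
      ```

      With the analyzer from the report, this could be called as PhraseQueryExample.ParseAsPhrase("FieldName", "+FieldName:Value_0", new CustomAnalyzer()), and the result compared with the 4.8 output shown above.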

            People

            • Assignee: Unassigned
            • Reporter: hindikaynen Khindikaynen Aleksey
            • Votes: 0
            • Watchers: 1
