Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 5.3.1
    • Fix Version/s: None
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      I find that when I use SpanNotQuery and the excluded keyword is a word like "not" or "or", it gives a wrong result.

      example:
      doc1:the quick brown fox jumps over the lazy dog
      doc2:the quick red fox jumps over the sleepy cat
      doc3:the quick brown fox jumps over the lazy NOT dog

      String queryStringStart = "dog";
      String queryStringEnd = "quick";
      String excludeString = "NOT";
      SpanQuery queryStart = new SpanTermQuery(new Term("text",queryStringStart));
      SpanQuery queryEnd = new SpanTermQuery(new Term("text",queryStringEnd));
      SpanQuery excludeQuery = new SpanTermQuery(new Term("text",excludeString));
      SpanQuery spanNearQuery = new SpanNearQuery(
          new SpanQuery[] {queryStart, queryEnd}, 7, false, false);

      SpanNotQuery spanNotQuery = new SpanNotQuery(spanNearQuery, excludeQuery, 4,3);

      This returns doc1 and doc3, although I expected doc3 to be excluded, so I think it is a bug.

        Activity

        jj380382856 jin jing added a comment -

        Have you ever encountered the same problem? What is the solution?

        jpountz Adrien Grand added a comment -

        Your analyzer probably performs lowercasing, so you should actually try to match "not" rather than "NOT".

        jj380382856 jin jing added a comment -

        I used the lowercase "not" but got the same result.

        jpountz Adrien Grand added a comment -

        Could you provide a standalone test case that reproduces the bug?

        jj380382856 jin jing added a comment -

        public static void main(String[] args) throws IOException {
            // Index the three example documents into an in-memory directory.
            Directory dir = new RAMDirectory();
            Analyzer analyzer = new StandardAnalyzer();
            IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
            iwc.setOpenMode(OpenMode.CREATE);
            IndexWriter writer = new IndexWriter(dir, iwc);

            Document doc = new Document();
            doc.add(new TextField("text", "the quick brown fox jumps over the lazy dog", Field.Store.YES));
            writer.addDocument(doc);

            doc = new Document();
            doc.add(new TextField("text", "the quick red fox jumps over the sleepy cat", Field.Store.YES));
            writer.addDocument(doc);

            doc = new Document();
            doc.add(new TextField("text", "the quick brown fox jumps over the lazy not dog", Field.Store.YES));
            writer.addDocument(doc);
            writer.close();

            IndexReader reader = DirectoryReader.open(dir);
            IndexSearcher searcher = new IndexSearcher(reader);

            String queryStringStart = "dog";
            String queryStringEnd = "quick";
            String excludeString = "not";
            SpanQuery queryStart = new SpanTermQuery(new Term("text", queryStringStart));
            SpanQuery queryEnd = new SpanTermQuery(new Term("text", queryStringEnd));
            SpanQuery excludeQuery = new SpanTermQuery(new Term("text", excludeString));

            // Match "dog" and "quick" in any order with a slop of 9.
            SpanQuery spanNearQuery = new SpanNearQuery(
                new SpanQuery[] {queryStart, queryEnd}, 9, false, false);

            // Exclude matches that have "not" within 4 positions before or 3 positions after.
            SpanNotQuery spanNotQuery = new SpanNotQuery(spanNearQuery, excludeQuery, 4, 3);
            TopDocs results = searcher.search(spanNotQuery, null, 100);
            ScoreDoc[] scoreDocs = results.scoreDocs;

            // Print the stored text of every matching document.
            for (int i = 0; i < scoreDocs.length; ++i) {
                int docID = scoreDocs[i].doc;
                Document document = searcher.doc(docID);
                String text = document.get("text");
                System.out.println("text:" + text);
            }
        }

        romseygeek Alan Woodward added a comment -

        "not" is a stopword, and is removed by the StandardAnalyzer by default.

        jj380382856 jin jing added a comment -

        Yes, it is. Thank you.
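
As a rough illustration of the stopword explanation above, the following plain-Java sketch models an analyzer that lowercases its input and drops stop words. It does not use Lucene: the class `StopwordSketch`, the method `indexedTerms`, and the three-word stop list are hypothetical names and values chosen for this example, and the real StandardAnalyzer tokenizes differently and ships a longer default stop list in the affected version. The sketch only shows that the term "not" never reaches the index, so a SpanTermQuery on it has nothing to exclude.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopwordSketch {
    // Hypothetical stop list for illustration only; not Lucene's actual default set.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the", "not", "or"));

    // Lowercases the text, splits on whitespace, and keeps only tokens
    // that are not in the given stop list (a simplified model of analysis).
    static List<String> indexedTerms(String text, Set<String> stopWords) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        String doc3 = "the quick brown fox jumps over the lazy not dog";

        // With stop words removed, "not" is absent from the indexed terms.
        List<String> withStopList = indexedTerms(doc3, STOP_WORDS);
        // With an empty stop list, "not" is kept and could be matched.
        List<String> withoutStopList = indexedTerms(doc3, new HashSet<String>());

        System.out.println(withStopList.contains("not"));    // false
        System.out.println(withoutStopList.contains("not")); // true
    }
}
```

In Lucene 5.x, constructing the analyzer with an empty stop set, for example `new StandardAnalyzer(CharArraySet.EMPTY_SET)`, should keep "not" in the index so that the exclude clause has something to match. This has not been run against the test case in this issue.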


          People

          • Assignee:
            Unassigned
          • Reporter:
            jj380382856 jin jing
          • Votes:
            0
          • Watchers:
            3

            Dates

            • Created:
              Updated:
              Resolved:
