Lucene - Core / LUCENE-3442

QueryWrapperFilter gets null DocIdSetIterator when wrapping TermQuery

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.1, 3.2, 3.3, 3.4
    • Fix Version/s: 3.5
    • Component/s: core/search
    • Labels:
      None
    • Environment:

      java 1.6.0_27

    • Lucene Fields:
      New

      Description

      If you try to get the iterator for the DocIdSet returned by a QueryWrapperFilter that wraps a TermQuery, you get null instead of an iterator that returns the same documents as a search on the TermQuery.

      Code demonstrating the issue:

      import java.io.IOException;
      import org.apache.lucene.analysis.WhitespaceAnalyzer;
      import org.apache.lucene.document.Document;
      import org.apache.lucene.document.Field;
      import org.apache.lucene.document.Field.Index;
      import org.apache.lucene.document.Field.Store;
      import org.apache.lucene.index.IndexReader;
      import org.apache.lucene.index.IndexWriter;
      import org.apache.lucene.index.IndexWriterConfig;
      import org.apache.lucene.index.Term;
      import org.apache.lucene.store.RAMDirectory;
      import org.apache.lucene.util.Version;
      import org.apache.lucene.search.DocIdSet;
      import org.apache.lucene.search.DocIdSetIterator;
      import org.apache.lucene.search.Filter;
      import org.apache.lucene.search.IndexSearcher;
      import org.apache.lucene.search.QueryWrapperFilter;
      import org.apache.lucene.search.TermQuery;
      import org.apache.lucene.search.TopDocs;
      
      public class TestQueryWrapperFilterIterator {
          public static void main(String[] args) {
              try {
                  IndexWriterConfig iwconfig = new IndexWriterConfig(Version.LUCENE_34, new WhitespaceAnalyzer(Version.LUCENE_34));
                  iwconfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
                  RAMDirectory dir = new RAMDirectory();

                  IndexWriter writer = new IndexWriter(dir, iwconfig);
                  Document d = new Document();
                  d.add(new Field("id", "1001", Store.YES, Index.NOT_ANALYZED));
                  d.add(new Field("text", "headline one group one", Store.YES, Index.ANALYZED));
                  d.add(new Field("group", "grp1", Store.YES, Index.NOT_ANALYZED));
                  writer.addDocument(d);
                  writer.commit();
                  writer.close();

                  IndexReader rdr = IndexReader.open(dir);
                  IndexSearcher searcher = new IndexSearcher(rdr);

                  TermQuery tq = new TermQuery(new Term("text", "headline"));

                  TopDocs results = searcher.search(tq, 5);
                  System.out.println("Number of search results: " + results.totalHits);

                  Filter f = new QueryWrapperFilter(tq);

                  DocIdSet dis = f.getDocIdSet(rdr);

                  DocIdSetIterator it = dis.iterator();
                  if (it != null) {
                      int docId = it.nextDoc();
                      while (docId != DocIdSetIterator.NO_MORE_DOCS) {
                          Document doc = rdr.document(docId);
                          System.out.println("Iterator doc: " + doc.get("id"));
                          docId = it.nextDoc();
                      }
                  } else {
                      System.out.println("Iterator was null: ");
                  }

                  searcher.close();
                  rdr.close();
              } catch (IOException ioe) {
                  ioe.printStackTrace();
              }
          }
      }
      
      1. LUCENE-3442.patch
        3 kB
        Uwe Schindler

        Activity

        Dan Climan created issue -
        Uwe Schindler made changes -
        Field Original Value New Value
        Assignee Uwe Schindler [ thetaphi ]
        Uwe Schindler added a comment -

        The issue lies in the fact that an optimization in TermQuery prevents its Weight.scorer() method from behaving correctly when a non-atomic reader is passed in. Passing a non-atomic reader is no longer supported in Lucene trunk, but in 3.x the weight should still be able to work on composite readers. The sample code does exactly this: it calls QWF.getDocIdSet on a non-atomic IndexReader. QWF calls TermWeight.scorer(), which returns null because the composite reader is not in its DF cache.

        The fix is easy: Don't early exit in scorer() if the reader passed in is not atomic.
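
The mechanism described above can be illustrated with a small, Lucene-free model. This is only a sketch under stated assumptions: the classes Reader and Weight, the methods scorerBuggy and scorerFixed, and the string standing in for a scorer are all hypothetical and are not Lucene's code or API. The model caches which atomic readers contain the term, then shows how an unconditional early exit returns null for a composite reader, while an early exit restricted to atomic readers does not.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model; none of these classes exist in Lucene.
public class EarlyExitModel {

    /** Toy reader: atomic when it has no sub-readers, composite otherwise. */
    static final class Reader {
        final List<Reader> subReaders = new ArrayList<>();
        final Set<String> terms = new HashSet<>();

        static Reader atomic(String... terms) {
            Reader r = new Reader();
            for (String t : terms) r.terms.add(t);
            return r;
        }

        static Reader composite(Reader... subs) {
            Reader r = new Reader();
            for (Reader s : subs) r.subReaders.add(s);
            return r;
        }

        boolean isAtomic() { return subReaders.isEmpty(); }

        boolean hasTerm(String term) {
            if (isAtomic()) return terms.contains(term);
            for (Reader sub : subReaders) {
                if (sub.hasTerm(term)) return true;
            }
            return false;
        }
    }

    /** Toy weight: remembers which atomic readers contain the term. */
    static final class Weight {
        final String term;
        // Reader does not override equals/hashCode, so this is an identity set.
        final Set<Reader> readersWithTerm = new HashSet<>();

        Weight(Reader top, String term) {
            this.term = term;
            collect(top);
        }

        private void collect(Reader r) {
            if (r.isAtomic()) {
                if (r.hasTerm(term)) readersWithTerm.add(r);
            } else {
                for (Reader sub : r.subReaders) collect(sub);
            }
        }

        /** Early exit for every reader: a composite reader is never in the cache. */
        String scorerBuggy(Reader r) {
            if (!readersWithTerm.contains(r)) return null;
            return r.hasTerm(term) ? "scorer" : null;
        }

        /** Early exit only for atomic readers, as the comment above proposes. */
        String scorerFixed(Reader r) {
            if (r.isAtomic() && !readersWithTerm.contains(r)) return null;
            return r.hasTerm(term) ? "scorer" : null;
        }
    }

    public static void main(String[] args) {
        Reader segment = Reader.atomic("headline", "one", "group");
        Reader top = Reader.composite(segment);
        Weight w = new Weight(top, "headline");

        System.out.println("buggy, composite reader: " + w.scorerBuggy(top));     // null
        System.out.println("fixed, composite reader: " + w.scorerFixed(top));     // scorer
        System.out.println("buggy, atomic reader:    " + w.scorerBuggy(segment)); // scorer
    }
}
```

In this model the atomic segment still gets a scorer either way, which matches the observation that the search on the TermQuery succeeds while the filter called on the top-level reader does not.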

        Uwe Schindler made changes -
        Fix Version/s 3.5 [ 12317877 ]
        Description: sample code wrapped in {code:java} and {code} markers; the text is otherwise unchanged.
        Uwe Schindler added a comment -

        Patch that fixes the issue. It also contains the testcase provided by Dan C.

        Will commit soon.

        Uwe Schindler made changes -
        Attachment LUCENE-3442.patch [ 12495259 ]
        Uwe Schindler added a comment -

        The issue was introduced by LUCENE-2829, committed in January, and has existed since Lucene 3.1.

        Uwe Schindler made changes -
        Affects Version/s 3.3 [ 12316470 ]
        Affects Version/s 3.2 [ 12316070 ]
        Affects Version/s 3.1 [ 12314822 ]
        Uwe Schindler added a comment -

        Committed 3.x branch revision: 1173311

        Uwe Schindler made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Uwe Schindler added a comment -

        Bulk close after release of 3.5

        Uwe Schindler made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition            Time In Source Status   Execution Times   Last Executer   Last Execution Date
        Open -> Resolved      30m 13s                 1                 Uwe Schindler   20/Sep/11 20:16
        Resolved -> Closed    67d 17h 13m             1                 Uwe Schindler   27/Nov/11 12:29

          People

          • Assignee: Uwe Schindler
          • Reporter: Dan Climan
          • Votes: 0
          • Watchers: 0
