Index: lucene/CHANGES.txt
===================================================================
--- lucene/CHANGES.txt (revision 1243036)
+++ lucene/CHANGES.txt (working copy)
@@ -78,6 +78,19 @@
  (Mike McCandless, Robert Muir, Uwe Schindler, Mark Miller, Michael Busch)

+* LUCENE-2858, LUCENE-3733: IndexReader was refactored into abstract
+ AtomicReader, CompositeReader, and DirectoryReader. To open Directory-
+ based indexes use DirectoryReader.open(), the corresponding method in
+ IndexReader is now deprecated for easier migration. Only DirectoryReader
+ supports commits, versions, and reopening with openIfChanged(). Terms,
+ postings, docvalues, and norms can from now on only be retrieved using
+ AtomicReader; DirectoryReader and MultiReader extend CompositeReader,
+ only offering stored fields and access to the sub-readers (which may be
+ composite or atomic). SlowCompositeReaderWrapper (LUCENE-2597) can be
+ used to emulate atomic readers on top of composites.
+ Please review MIGRATE.txt for information how to migrate old code.
+ (Uwe Schindler, Robert Muir, Mike McCandless)
+
 * LUCENE-2265: FuzzyQuery and WildcardQuery now operate on Unicode codepoints,
  not unicode code units. For example, a Wildcard "?" represents any unicode
  character. Furthermore, the rest of the automaton package and RegexpQuery use
@@ -98,7 +111,7 @@
  and TermToBytesRefAttribute instead. (Uwe Schindler)

 * LUCENE-2600: Remove IndexReader.isDeleted in favor of
- IndexReader.getDeletedDocs(). (Mike McCandless)
+ AtomicReader.getDeletedDocs(). (Mike McCandless)

 * LUCENE-2667: FuzzyQuery's defaults have changed for more performant
  behavior: the minimum similarity is 2 edit distances from the word,
@@ -117,10 +130,6 @@
  need to change it (e.g. using "\\" to escape '\' itself).
  (Sunil Kamath, Terry Yang via Robert Muir)

-* LUCENE-2771: IndexReader.norms() now throws UOE on non-atomic IndexReaders. If
- you really want a top-level norms, use MultiNorms or SlowMultiReaderWrapper.
- (Uwe Schindler, Robert Muir)
-
 * LUCENE-2837: Collapsed Searcher, Searchable into IndexSearcher; removed
  contrib/remote and MultiSearcher (Mike McCandless); absorbed
  ParallelMultiSearcher into IndexSearcher as an optional
@@ -189,9 +198,9 @@
  with the old tokenStream() method removed. Consequently it is now mandatory
  for all Analyzers to support reusability. (Chris Male)

-* LUCENE-3473: IndexReader.getUniqueTermCount() no longer throws UOE when
- it cannot be easily determined (e.g. Multi*Readers). Instead, it returns
- -1 to be consistent with this behavior across other index statistics.
+* LUCENE-3473: AtomicReader.getUniqueTermCount() no longer throws UOE when
+ it cannot be easily determined. Instead, it returns -1 to be consistent with
+ this behavior across other index statistics.
  (Robert Muir)

 * LUCENE-1536: The abstract FilteredDocIdSet.match() method is no longer
@@ -207,18 +216,18 @@
 * LUCENE-3533: Removed SpanFilters, they created large lists of objects and
  did not scale. (Robert Muir)

-* LUCENE-3606: IndexReader was made read-only. It is no longer possible to
- delete or undelete documents using IndexReader; you have to use IndexWriter
- now. As deleting by internal Lucene docID is no longer possible, this
- requires adding a unique identifier field to your index. Deleting/relying
- upon Lucene docIDs is not recommended anyway, because they can change.
- Consequently commit() was removed and IndexReader.open(), openIfChanged(),
- and clone() no longer take readOnly booleans or IndexDeletionPolicy
+* LUCENE-3606: IndexReader and subclasses were made read-only. It is no longer
+ possible to delete or undelete documents using IndexReader; you have to use
+ IndexWriter now. As deleting by internal Lucene docID is no longer possible,
+ this requires adding a unique identifier field to your index. Deleting/
+ relying upon Lucene docIDs is not recommended anyway, because they can
+ change. Consequently commit() was removed and DirectoryReader.open(),
+ openIfChanged() no longer take readOnly booleans or IndexDeletionPolicy
  instances. Furthermore, IndexReader.setNorm() was removed. If you need
  customized norm values, the recommended way to do this is by modifying
  Similarity to use an external byte[] or one of the new DocValues fields
  (LUCENE-3108). Alternatively, to dynamically change norms (boost
- *and* length norm) at query time, wrap your IndexReader using
+ *and* length norm) at query time, wrap your AtomicReader using
  FilterIndexReader, overriding FilterIndexReader.norms(). To persist the
  changes on disk, copy the FilteredIndexReader to a new index using
  IndexWriter.addIndexes(). (Uwe Schindler, Robert Muir)
@@ -231,13 +240,10 @@
  FieldInfo.IndexOption: DOCS_AND_POSITIONS_AND_OFFSETS.
  (Robert Muir, Mike McCandless)

-* LUCENE-3646: FieldCacheImpl now throws UOE on non-atomic IndexReaders. If
- you really want a top-level fieldcache, use SlowMultiReaderWrapper.
- (Robert Muir)
-
-* LUCENE-2858, LUCENE-3733: IndexReader was refactored into abstract
- AtomicReader, CompositeReader, and DirectoryReader. TODO:add more info
- (Uwe Schindler, Mike McCandless, Robert Muir)
+* LUCENE-2858: FilterIndexReader now extends AtomicReader. If you want to
+ filter composite readers like DirectoryReader or MultiReader, filter
+ their atomic leaves and build a new CompositeReader (e.g. MultiReader)
+ around them. (Uwe Schindler, Robert Muir)

 * LUCENE-3736: ParallelReader was split into ParallelAtomicReader
  and ParallelCompositeReader. Lucene 3.x's ParallelReader is now
@@ -335,9 +341,6 @@
  (Mike McCandless, Michael Busch, Simon Willnauer)

-* LUCENE-3146: IndexReader.setNorm throws IllegalStateException if the field
- does not store norms. (Shai Erera, Mike McCandless)
-
 * LUCENE-3309: Stored fields no longer record whether they were
  tokenized or not. In general you should not rely on stored fields
  to record any "metadata" from indexing (tokenized, omitNorms,
@@ -354,12 +357,12 @@
  and iterated as byte[] (wrapped in a BytesRef) by IndexReader for
  searching.

-* LUCENE-1458, LUCENE-2111: IndexReader now directly exposes its
+* LUCENE-1458, LUCENE-2111: AtomicReader now directly exposes its
  deleted docs (getDeletedDocs), providing a new Bits interface to
  directly query by doc ID.

 * LUCENE-2691: IndexWriter.getReader() has been made package local and is now
- exposed via open and reopen methods on IndexReader. The semantics of the
+ exposed via open and reopen methods on DirectoryReader. The semantics of the
  call is the same as it was prior to the API change.
  (Grant Ingersoll, Mike McCandless)

@@ -370,14 +373,6 @@
  Collector#setNextReader & FieldComparator#setNextReader now expect an
  AtomicReaderContext instead of an IndexReader. (Simon Willnauer)

-* LUCENE-2846: Remove the deprecated IndexReader.setNorm(int, String, float).
- This method was only syntactic sugar for setNorm(int, String, byte), but
- using the global Similarity.getDefault().encodeNormValue. Use the byte-based
- method instead to ensure that the norm is encoded with your Similarity.
- Also removed norms(String, byte[], int), which was only used by MultiReader
- for building top-level norms. If you really need a top-level norms, use
- MultiNorms or SlowMultiReaderWrapper. (Robert Muir, Mike Mccandless)
-
 * LUCENE-2892: Add QueryParser.newFieldQuery (called by getFieldQuery by
  default) which takes Analyzer as a parameter, for easier customization by
  subclasses. (Robert Muir)
@@ -485,7 +480,7 @@
  Sep*, makes it simple to take any variable sized int block coders
  (like Simple9/16) and use them in a codec. (Mike McCandless)

-* LUCENE-2597: Add oal.index.SlowMultiReaderWrapper, to wrap a
+* LUCENE-2597: Add oal.index.SlowCompositeReaderWrapper, to wrap a
  composite reader (eg MultiReader or DirectoryReader), making it
  pretend it's an atomic reader. This is a convenience class (you can
  use MultiFields static methods directly, instead) if you need to use
@@ -621,7 +616,7 @@
 * LUCENE-1536: Filters can now be applied down-low, if their DocIdSet implements
  a new bits() method, returning all documents in a random access way. If the
  DocIdSet is not too sparse, it will be passed as acceptDocs down to the Scorer
- as replacement for IndexReader's live docs.
+ as replacement for AtomicReader's live docs.
  In addition, FilteredQuery backs now IndexSearcher's filtering search methods.
  Using FilteredQuery you can chain Filters in a very performant way
  [new FilteredQuery(new FilteredQuery(query, filter1), filter2)], which was not
@@ -635,7 +630,7 @@
  load only certain fields when loading a document.
  (Peter Chang via Mike McCandless)

-* LUCENE-3628: Norms are represented as DocValues. IndexReader exposes
+* LUCENE-3628: Norms are represented as DocValues. AtomicReader exposes
  a #normValues(String) method to obtain norms per field. (Simon Willnauer)

 * LUCENE-3687: Similarity#computeNorm(FieldInvertState, Norm) allows to compute
Index: lucene/core/src/java/org/apache/lucene/search/package.html
===================================================================
--- lucene/core/src/java/org/apache/lucene/search/package.html (revision 1243036)
+++ lucene/core/src/java/org/apache/lucene/search/package.html (working copy)
@@ -326,8 +326,8 @@
 Weight#normalize(float) — Determine the query normalization factor. The query normalization may allow for comparing scores between queries.