Details
New Feature
Status: Open
Major
Resolution: Unresolved
4.0-ALPHA
None
None
New
Description
This issue adds two new Codec implementations, TeeCodec and FilteringCodec, plus a supporting TeeDirectory abstraction:
- TeeCodec: there have been attempts in the past to implement parallel writing to multiple indexes so that they all stay synchronized. These attempts were complicated by the IndexWriter/SegmentMerger logic. The solution presented here offers similar functionality at a different level: as the name suggests, TeeCodec duplicates index data into multiple output Directories.
- TeeDirectory (also used in TeeCodec) is a simple abstraction that performs Directory operations on several directories in parallel, effectively mirroring their data. Optionally, you can specify a set of file suffixes to mirror, so that non-matching files are skipped.
- FilteringCodec is remotely related to the index pruning ideas presented in LUCENE-1812 and to the concept of tiered search. Since we can use TeeCodec to write to multiple output Directories in a synchronized way, we could also filter out or modify some of the data being written. FilteringCodec provides this functionality, so you can use it like this:

  IndexWriter --> TeeCodec
                   |
                   +--> StandardCodec --> Directory1
                   |
                   +--> FilteringCodec --> StandardCodec --> Directory2
The end result of this chain is two indexes that are kept in sync - one is the full regular index, and the other one is a filtered index.
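
The mirroring idea behind TeeDirectory can be illustrated with a minimal sketch. This is not the code from the patch and does not use any Lucene API: the class TeeDirectorySketch, its writeFile method, and the split into one primary directory plus several mirrors are all assumptions made for illustration, built on plain java.nio.file calls. It only shows how a suffix set could decide which files get copied to the mirrors.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical sketch of suffix-filtered mirroring (not the patch's TeeDirectory).
 * Assumption: every file goes to a primary directory, and only files whose
 * names match a configured suffix are also copied to the mirror directories.
 */
final class TeeDirectorySketch {
    private final Path primary;
    private final List<Path> mirrors;
    private final Set<String> suffixes; // empty set means "mirror every file"

    TeeDirectorySketch(Path primary, List<Path> mirrors, Set<String> suffixes) {
        this.primary = primary;
        this.mirrors = List.copyOf(mirrors);
        this.suffixes = Set.copyOf(suffixes);
    }

    /** Returns true if the file should be copied to the mirrors. */
    boolean isMirrored(String fileName) {
        return suffixes.isEmpty() || suffixes.stream().anyMatch(fileName::endsWith);
    }

    /** Writes the file to the primary and, if it matches, to each mirror. Returns the number of copies written. */
    int writeFile(String fileName, byte[] data) throws IOException {
        Files.write(primary.resolve(fileName), data);
        int copies = 1;
        if (isMirrored(fileName)) {
            for (Path mirror : mirrors) {
                Files.write(mirror.resolve(fileName), data);
                copies++;
            }
        }
        return copies;
    }

    public static void main(String[] args) throws IOException {
        Path primary = Files.createTempDirectory("tee-primary");
        Path mirror = Files.createTempDirectory("tee-mirror");
        TeeDirectorySketch tee =
            new TeeDirectorySketch(primary, List.of(mirror), Set.of(".frq", ".tis"));

        // Matches a suffix: written to both directories.
        tee.writeFile("_0.frq", "postings".getBytes(StandardCharsets.UTF_8));
        // Does not match: written to the primary only.
        tee.writeFile("_0.fdt", "stored fields".getBytes(StandardCharsets.UTF_8));

        System.out.println("mirror has _0.frq: " + Files.exists(mirror.resolve("_0.frq")));
        System.out.println("mirror has _0.fdt: " + Files.exists(mirror.resolve("_0.fdt")));
    }
}
```

In this sketch an empty suffix set means every file is mirrored, which corresponds to the default behaviour described above where the filter is optional.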