Issue Details (XML | Word | Printable)

Key: LUCENE-1313
Type: New Feature New Feature
Status: Open Open
Priority: Minor Minor
Assignee: Unassigned
Reporter: Jason Rutherglen
Votes: 1
Watchers: 17
Operations

If you were logged in you would be able to see more operations.
Lucene - Java

Near Realtime Search (using a built in RAMDirectory)

Created: 22/Jun/08 05:02 PM   Updated: Tuesday 11:53 PM
Return to search
Component/s: Index
Affects Version/s: 2.4.1
Fix Version/s: 3.1

Time Tracking:
Not Specified

File Attachments:
  Size
Java Archive File Licensed for inclusion in ASF works LUCENE-1313.jar 2009-04-08 12:22 AM Jason Rutherglen 5 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-11-24 11:53 PM Jason Rutherglen 35 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-11-09 08:47 PM Jason Rutherglen 60 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-11-05 09:43 PM Jason Rutherglen 39 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-11-05 07:33 PM Jason Rutherglen 37 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-11-05 06:34 AM Jason Rutherglen 36 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-11-05 02:15 AM Jason Rutherglen 22 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-11-05 02:05 AM Jason Rutherglen 22 kB
Text File LUCENE-1313.patch 2009-11-04 07:51 PM Jason Rutherglen 14 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-11-03 12:37 AM Jason Rutherglen 36 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-11-02 07:04 PM Jason Rutherglen 33 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-09-23 02:11 AM Jason Rutherglen 175 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-09-23 02:03 AM Jason Rutherglen 166 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-06-30 09:28 PM Jason Rutherglen 175 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-06-22 10:16 PM Jason Rutherglen 175 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-06-18 10:52 PM Jason Rutherglen 173 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-06-18 12:55 AM Jason Rutherglen 170 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-06-05 04:44 AM Jason Rutherglen 131 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-05-19 09:59 PM Jason Rutherglen 129 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-05-12 03:20 AM Jason Rutherglen 109 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-05-05 12:32 AM Jason Rutherglen 69 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-05-01 06:38 PM Jason Rutherglen 47 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-04-30 11:19 PM Jason Rutherglen 52 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-04-30 09:59 PM Jason Rutherglen 49 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-04-30 08:25 PM Jason Rutherglen 37 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-04-20 09:09 PM Jason Rutherglen 21 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-04-17 08:27 PM Jason Rutherglen 14 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2009-04-01 10:48 PM Jason Rutherglen 12 kB
Text File Licensed for inclusion in ASF works LUCENE-1313.patch 2008-10-01 06:08 PM Jason Rutherglen 473 kB
Text File Licensed for inclusion in ASF works lucene-1313.patch 2008-07-17 02:35 PM Jason Rutherglen 474 kB
Text File Licensed for inclusion in ASF works lucene-1313.patch 2008-06-24 09:31 PM Jason Rutherglen 698 kB
Text File Licensed for inclusion in ASF works lucene-1313.patch 2008-06-24 12:21 AM Jason Rutherglen 680 kB
Text File Licensed for inclusion in ASF works lucene-1313.patch 2008-06-22 05:06 PM Jason Rutherglen 1.88 MB
Issue Links:
Blocker
 
Dependants
 
Reference
 
dependent
 

Lucene Fields: Patch Available, New


 Description  « Hide
Enable near realtime search in Lucene without external
dependencies. When RAM NRT is enabled, the implementation adds a
RAMDirectory to IndexWriter. Flushes go to the ramdir unless
there is no available space. Merges are completed in the ram
dir until there is no more available ram.

IW.optimize and IW.commit flush the ramdir to the primary
directory, all other operations try to keep segments in ram
until there is no more space.



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Jason Rutherglen added a comment - 22/Jun/08 05:06 PM
lucene-1313.patch

Patch includes libraries:

  • commons-io-1.3.2.jar
  • commons-lang-2.3.jar
  • jdom.jar
  • slf4j-api-1.5.2.jar
  • slf4j-simple-1.5.2.jar
  • source from http://reader.imagero.com/uio/
  • source from net.sourceforge.jsorter

Jason Rutherglen made changes - 22/Jun/08 05:06 PM
Field Original Value New Value
Attachment lucene-1313.patch [ 12384453 ]
Jason Rutherglen added a comment - 24/Jun/08 12:21 AM
lucene-1313.patch

Removed http://reader.imagero.com/uio/ code as it routinely corrupted the log. It's replacement is RandomAccessFile. Added MultiThreadSearcherPolicy that is used to created a multi threaded Searcher. Transaction multithreading has been removed because it makes it hard to debug. It will be optional in the future. Many bugs have been fixed. TestSearch tests for deletes. Index directories are now of the form "2_index", the index id and the suffix "_index".


Jason Rutherglen made changes - 24/Jun/08 12:21 AM
Attachment lucene-1313.patch [ 12384545 ]
Jason Rutherglen added a comment - 24/Jun/08 09:31 PM
lucene-1313.patch

Depends on LUCENE-1312 and LUCENE-1314. More bugs fixed. Deletes are committed to indexes only intermittently which improves the update speed dramatically. MaybeMergeIndexes now runs via a background timer.

Will remove writing a snapshot.xml file per transaction in favor of a human readable log. Creating and deleting these small files is a bottleneck for update speed. This way a transaction writes to 2 files only. The merges happen in the background and so never affect the transaction update speed. I am not sure how useful it would be, but it is possible to have a priority based IO system that favors transactions over merges. If a transaction is coming in and a merge is happening to disk, the merge is stopped and the transaction IO runs, then the merge IO continues.

I am not sure how to handle Documents with Fields that have a TokenStream as the value as I believe these cannot be serialized. For now I assume it will be unsupported.

Also not sure how to handle analyzers, are these generally serializable? It would be useful to serialize them for a more automated log recovery process.


Jason Rutherglen made changes - 24/Jun/08 09:31 PM
Attachment lucene-1313.patch [ 12384629 ]
Jason Rutherglen added a comment - 17/Jul/08 02:35 PM
lucene-1313.patch
  • Depends on LUCENE-1314
  • OceanSegmentReader implements reuse of deletedDocs bytes in conjunction with LUCENE-1314
  • Snapshot logging happens to a rolling log file
  • CRC32 checking added to transaction log
  • Added TestSystem test case that performs adds, updates and deletes. TestSystem uses arbitrarily small settings numbers to force the various background merges to happen within a minimal number of transactions
  • Transactions with over N documents encoded into a segment (via RAMDirectory) to the transaction log rather than serialized as a Document
  • Started wiki page http://wiki.apache.org/lucene-java/OceanRealtimeSearch linked from http://wiki.apache.org/lucene-java/LuceneResources. Will place documentation there.
  • Document fields with Reader or TokenStream values supported

Began work on LargeBatch functionality, needs test case. Large batches allow adding documents in bulk (also performing deletes) in a transaction that goes straight to an index bypassing the transaction log. This provides the same speed as using IndexWriter to perform bulk Document processing in Ocean.

Started OceanDatabase which will offer a Java API inspired by GData. Will offer optimistic concurrency (something required in a realtime search system) and dynamic object mapping (meaning types such as long, date, double will be mapped to a string term using some Solr code). A file sync is performed after each transaction, will add an option to allow syncing after N transactions like mysql. This will improve realtime update speeds.

Future:

  • Support for multiple servers by implementing distributed API and replication using LUCENE-1336
  • Test case that is akin to TestStressIndexing2 mainly to test threading
  • Add LargeBatch test to TestSystem
  • Facets
  • Looking at adding GData compatible XML over HTTP API. Possibly can reuse the old Lucene GData code.
  • Integrate tag index when it's completed
  • Add LRU record cache to transaction log which will be useful for faster replication

Jason Rutherglen made changes - 17/Jul/08 02:35 PM
Attachment lucene-1313.patch [ 12386303 ]
Karl Wettin added a comment - 02/Sep/08 09:05 AM
Hi Jason,

I took an inital look at your code last night. Didn't actually execute anything, just followed method calls around to see what it was up to.

My first comment is sort of boring, but there are virtually no javadocs for the core classes such as TransactionSystem, Batch and Index. It would be great if there was a bit at the class level exaplaining what classes they interact with and how. It would also be very helpful if there was method level javadocs for at least the top level commit related logic.

One thing that early cought my attention is this method in TransactionSystem:

public OceanSearcher getSearcher() throws IOException {
    Snapshot snapshot = snapshots.getLatestSnapshot();
    if (searcherPolicy instanceof SingleThreadSearcherPolicy) {
      return new OceanSearcher(snapshot);
    } else {
      return new OceanMultiThreadSearcher(snapshot, searchThreadPool);
    }
  }

Am I supposed to call this method for each query (as suggested by the method name) or is this a factory method used to update my own Searcher instance after committing documents to the index (as suggested by the code)?

It's not such a big deal, but I personally think you should refactor the instanceOf to a Policy.searcherFactory method, or perhaps even a SearcherPolicyVisitor. Actually, this goes for a few other places in the module too: you have used instanceOf and unchecked casting a bit more extensive to solve problems than what I would have. But as it does not seem to be used in places where it would be a costrly thing to do these comments are mearly about code readability and gut feelings about future problems.

I'm a bit concerned about the potential loss of data while documents only resides in InstantiatedIndex or RAMDirectory. I think I'd like an option on some sort of transaction log that could be played up in case of a crash. I think the easiset way would be to convert all documents to be pre-analyzed (field.tokenStream) before passing them on to the instantiated writer. I don't know how much resources that might consume, but it would make me feel safer.

karl


Jason Rutherglen added a comment - 02/Sep/08 12:05 PM
Hi Karl,

Thanks for taking a look at the code! Yes the methods need javadoc, I was waiting to see if I had settled on them, and because I started building new code on top I guess the methods have settled so I need to add javadoc to them.

If you are using TransactionSystem then the getSearcher method would be called for each query. I have developed OceanDatabase which makes the searching transparent and implements optimistic concurrency (version number stored in the document). I believe most systems will want to use OceanDatabase, however the raw TransactionSystem which is more like IndexWriter will be left as well. I have been working on OceanDatabase and have neglected the javadocs of TransactionSystem.

I modeled the searcherPolicy instanceof code on the MergeScheduler type of system where there is a marker interface that the subclasses implement. I don't mind changing it, or if you want to you can as well. I considered it a minor detail though and admittedly did not spend much time on it. You are welcome to change it.

The transaction log is replayed on a restart of the system. It repopulates a RamIndex (uses RAMDirectory) on startup based on the max snapshot id of the existing indexes, and replays the transaction log from there.

I looked at converting documents to a token stream, the problem is, if the field is stored, it creates redundant storage of the data in the transaction log. Ultimately I could not find anything to be gained from storing a token stream. Also if it was converted, what would happen with stored fields? The issue with replaying the document later though is not having the Analyzer. In the distributed object code patch LUCENE-1336 I made Analyzer Serializable. I think it's best to serialize the Analyzer, or create a small database of serialized analyzers that can be called upon during the transaction log recovery process. Because I am not entirely sure about the ramifications of serializing the Analyzer, for example, how much data a serialized Analyzer may have. Perhaps other have some ideas or feedback about serializing analyzers.

In conclusion, I'll add more javadocs. Please feel free to ask more questions!

Jason


Jason Rutherglen added a comment - 03/Sep/08 06:42 PM
Is there a good place to place the javadocs on the Apache website once they are more complete?

Jason Rutherglen added a comment - 01/Oct/08 06:08 PM
LUCENE-1313.patch

Added javadocs. Still needs the LUCENE-1314 completed which will be divided into multiple patches.


Jason Rutherglen made changes - 01/Oct/08 06:08 PM
Attachment LUCENE-1313.patch [ 12391299 ]
Jason Rutherglen made changes - 01/Apr/09 09:06 PM
Component/s contrib/* [ 12312028 ]
Component/s Index [ 12310232 ]
Fix Version/s 2.9 [ 12312682 ]
Priority Major [ 3 ] Minor [ 4 ]
Description Provides realtime search using Lucene. Conceptually, updates are divided into discrete transactions. The transaction is recorded to a transaction log which is similar to the mysql bin log. Deletes from the transaction are made to the existing indexes. Document additions are made to an in memory InstantiatedIndex. The transaction is then complete. After each transaction TransactionSystem.getSearcher() may be called which allows searching over the index including the latest transaction.

TransactionSystem is the main class. Methods similar to IndexWriter are provided for updating. getSearcher returns a Searcher class.

- getSearcher()
- addDocument(Document document)
- addDocument(Document document, Analyzer analyzer)
- updateDocument(Term term, Document document)
- updateDocument(Term term, Document document, Analyzer analyzer)
- deleteDocument(Term term)
- deleteDocument(Query query)
- commitTransaction(List<Document> documents, Analyzer analyzer, List<Term> deleteByTerms, List<Query> deleteByQueries)

Sample code:

{code}
// setup
FSDirectoryMap directoryMap = new FSDirectoryMap(new File("/testocean"), "log");
LogDirectory logDirectory = directoryMap.getLogDirectory();
TransactionLog transactionLog = new TransactionLog(logDirectory);
TransactionSystem system = new TransactionSystem(transactionLog, new SimpleAnalyzer(), directoryMap);

// transaction
Document d = new Document();
d.add(new Field("contents", "hello world", Field.Store.YES, Field.Index.TOKENIZED));
system.addDocument(d);

// search
OceanSearcher searcher = system.getSearcher();
ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
System.out.println(hits.length + " total results");
for (int i = 0; i < hits.length && i < 10; i++) {
  Document d = searcher.doc(hits[i].doc);
  System.out.println(i + " " + hits[i].score+ " " + d.get("contents");
}
{code}

There is a test class org.apache.lucene.ocean.TestSearch that was used for basic testing.

A sample disk directory structure is as follows:

|/snapshot_105_00.xml | XML file containing which indexes and their generation numbers correspond to a snapshot. Each transaction creates a new snapshot file. In this file the 105 is the snapshotid, also known as the transactionid. The 00 is the minor version of the snapshot corresponding to a merge. A merge is a minor snapshot version because the data does not change, only the underlying structure of the index|
|/3 | Directory containing an on disk Lucene index|
|/log | Directory containing log files|
|/log/log00000001.bin | Log file. As new log files are created the suffix number is incremented|

Realtime search with transactional semantics.

Possible future directions:
  * Optimistic concurrency
  * Replication

Encoding each transaction into a set of bytes by writing to a RAMDirectory enables replication. It is difficult to replicate using other methods because while the document may easily be serialized, the analyzer cannot.

I think this issue can hold realtime benchmarks which include indexing and searching concurrently.
Affects Version/s 2.4.1 [ 12313516 ]
Summary Ocean Realtime Search Realtime Search
Jason Rutherglen added a comment - 01/Apr/09 10:48 PM - edited
The patch includes RealtimeIndex a basic class for performing atomic
transactional realtime indexing and search. A single thread
periodically flushes to disk the ram index. It relies on
LUCENE-1516.

We need to benchmark this, specifically 1) realtime w/ramdir
transaction 2) realtime w/queued documents transaction 3) normal
indexing. Realtime w/ramdir encodes the transaction to a
RAMDirectory which is added to the RAM writer using
IW.addIndexesNoOptimize. Option 1 may be slower than option 2,
however if the system is replicating it may be the only option?

Long term I believe we need to implement searching over the
IndexWriter ram buffer (if possible). However I am not sure how
option 1 and replication would work with it?


Jason Rutherglen made changes - 01/Apr/09 10:48 PM
Attachment LUCENE-1313.patch [ 12404393 ]
Michael McCandless added a comment - 02/Apr/09 08:53 AM
Jason, your last patch looks like it's taking the "flush first to RAM Dir" approach I just described as the next step (on the java-dev thread), right? Which is great!

So this has no external dependencies, right? And it simply layers on top of LUCENE-1516.

I'd be very interested to compare (benchmark) this approach vs solely LUCENE-1516.

Could we change this class so that instead of taking a Transaction object, holding adds & deletes, it simply mirrors IndexWriter's API? Ie, I'd like to decouple the performance optimization of "let's flush small segments ithrough a RAMDir first" from the transactional semantics of "I process a transaction atomically, and lock out other thread's transactions". Ie, the transactional restriction could/should layer on top of this performance optimization for near-realtime search?


Jason Rutherglen added a comment - 06/Apr/09 05:25 PM

So this has no external dependencies, right?

Yes.

I'd be very interested to compare (benchmark) this approach
vs solely LUCENE-1516.

Is the .alg using the NearRealtimeReader from LUCENE-1516 our
best measure of realtime performance?

the transactional restriction could/should layer on
top of this performance optimization for near-realtime search?

The transactional system should be able to support both methods.
Perhaps a non-locking setting would allow the same RealtimeIndex
class support both modes of operation?


Jason Rutherglen added a comment - 06/Apr/09 10:11 PM
We'll need to integrate the RAM based indexer into IndexWriter
to carry over the deletes to the ram index while it's copied to
disk. This is similar to IndexWriter.commitMergedDeletes
carrying deletes over at the segment reader level based on a
comparison of the current reader and the cloned reader.
Otherwise there's redundant deletions to the disk index using
IW.deleteDocuments which can be unnecessarily expensive. To make
external we would need to do the delete by doc id genealogy.

Michael McCandless added a comment - 07/Apr/09 08:57 AM

> I'd be very interested to compare (benchmark) this approach
> vs solely LUCENE-1516.

Is the .alg using the NearRealtimeReader from LUCENE-1516 our
best measure of realtime performance?

So far, I think so? You get to set an update rate (delete + add) docs, eg 50 docs/sec, and a pause time between NRT reopens.

Still, it's synthetic. If you guys (LinkedIn) have a way to fold in some realism into the test, that'd be great, if only "our app ingests at X docs(MB)/sec and reopens the NRT reader X times per second" to set our ballback.

> the transactional restriction could/should layer on
> top of this performance optimization for near-realtime search?

The transactional system should be able to support both methods.
Perhaps a non-locking setting would allow the same RealtimeIndex
class support both modes of operation?

Sorry, what are both "modes" of operation?

I think there are two different "layers" here – first layer optimizes NRT by flushing small segments to RAMDir first. This seems generally useful and in theory has no impact to the API IndexWriter exposes (it's "merely" an internal optimization). The second layer adds this new Transaction object, such that N adds/deletes/commit/re-open NRT reader can be done atomically wrt other pending Transaction objects.

We'll need to integrate the RAM based indexer into IndexWriter
to carry over the deletes to the ram index while it's copied to
disk. This is similar to IndexWriter.commitMergedDeletes
carrying deletes over at the segment reader level based on a
comparison of the current reader and the cloned reader.
Otherwise there's redundant deletions to the disk index using
IW.deleteDocuments which can be unnecessarily expensive. To make
external we would need to do the delete by doc id genealogy.

Right, I think the RAMDir optimization would go directly into IW, if we can separate it out from Transaction. It could naturally derive from the existing RAMBufferSizeMB, ie if NRT forces a flush, so long as its tiny, put it into the local RAMDir instead of the actual Dir, then "deduct" that size from the allowed budget of DW's ram usage. When RAMDIr + DW exceeds RAMBufferSizeMB, we then merge all of RAMDir's segments into a "real" segment in the directory.


Jason Rutherglen added a comment - 08/Apr/09 12:22 AM
Latest realtime code, transactions are removed.
  • Needs to be benchmarked
  • There could be concurrency issues around deletes that occur
    while directories are being flushed to disk.
  • It's Java JARed to include the files and directory structure.
    The patch relies on LUCENE-1516 which if included would make the
    changes incomprehensible

Jason Rutherglen made changes - 08/Apr/09 12:22 AM
Attachment LUCENE-1313.jar [ 12404907 ]
Jason Rutherglen added a comment - 08/Apr/09 09:43 PM

Still, it's synthetic. If you guys (LinkedIn) have a way
to fold in some realism into the test, that'd be great, if only
"our app ingests at X docs(MB)/sec and reopens the NRT reader X
times per second" to set our ballback.

The test we need to progress to is running the indexing side
endlessly while also reopening every X seconds, then
concurrently running searches. This way we can play with a bunch
of settings (mergescheduler threads, merge factors, max merge
docs, etc), use the python code to generate a dozen cases,
execute them and find out what seems optimal for our corpus.
It's a bit of work but probably the only way each Lucene user
can conclusively say they have the optimal settings needed for
their system. Usually there is a baseline QPS that is desired,
where the reopen delay may be increased to accommodate a lack of
QPS.

The ram dir portion of the NRT indexing increases in speed when
more threads are allocated but those compete with search
threads, another issue to keep in mind.

It might be good to add some default charting to
contrib/benchmark?


Michael McCandless added a comment - 09/Apr/09 09:19 AM

The test we need to progress to is running the indexing side
endlessly while also reopening every X seconds, then
concurrently running searches

Do you have a sense of what we'd need to add to contrib/benchmark to make this test possible? LUCENE-1516 takes the first baby step (adds a "NearRealtimeReaderTask").

Usually there is a baseline QPS that is desired,
where the reopen delay may be increased to accommodate a lack of
QPS.

Right – that's the point I made on java-dev about the "freedom" we have wrt NRT's performance.

The ram dir portion of the NRT indexing increases in speed when
more threads are allocated but those compete with search
threads, another issue to keep in mind.

Well, single threaded indexing speed is also improved by using RAM dir. Ie the use of RAM dir is orthogonal to the app's use of threads for indexing?

It might be good to add some default charting to
contrib/benchmark?

I've switched to Google's visualization API (http://code.google.com/apis/visualization/) which is a delight (they have a simple-to-use Python wrapper). It'd be awesome to somehow get simple charting folded into benchmark... maybe start w/ shear data export (as tab/comma delimited line file), and then have a separate step that slurps that data in and makes a [Google vis] chart.


Jason Rutherglen added a comment - 17/Apr/09 08:27 PM
I added an IndexWriter.getRAMIndex method that returns a
RAMIndex object that can be updated and flushed to the
underlying writer. I think this is better than adding more
methods to IndexWriter and it separates out the logic of the RAM
based near realtime index and the rest of IW.

Package protected IW.addIndexesNoOptimize(DirectoryIndexReader[]
readers) is added which is used by RAMIndex.flush. I thought
this functionality could work for LUCENE-1589 as a public
method, however because of the way IndexWriter performs merges
using segment infos, handling generic IndexReader classes (which
may not use segmentinfos) would then be difficult in the
addIndexesNoOptimize case.

I think RAMIndex.flush to the underlying writer is not
synchronized. If the IW is using ConcurrentMergeScheduler then
the heavy lifting is performed in the background and so should
not delay adding more documents to the RAMIndex.

IW.getReader returns the normal IW reader and the RAMIndex
reader if there is one.

The RAMIndex writer can be obtained and modified directly as
opposed to duplicating the setter methods of IndexWriter such as
setMergeScheduler.


Jason Rutherglen made changes - 17/Apr/09 08:27 PM
Attachment LUCENE-1313.patch [ 12405804 ]
Jason Rutherglen added a comment - 20/Apr/09 09:09 PM
  • The RAMIndex deletes approach changed to be like IndexWriter.
    The deletes are queued in lists, then applied on RI.flush.
  • There is redundancy between IW.delete* and RI.delete*, perhaps
    we don't need RI.delete*?
  • We need more multithreaded tests, probably based on
    TestIndexWriter to see if we can trigger issues in regards to
    deletes that occur while RI is calling IW.addIndexesNoOptimize.
  • If RI.delete* is removed, do we need a separate RAMIndex class
    to add documents to or is there a more transparent way for NRT
    ramdir to work? Perhaps we can add an IW.flushToRamDir (whereas
    IW.flush writes to the IW directory) method that flushes the
    rambuffer to the RAMIndex? Some of the the issues are around
    swapping out the RAMDir once it's segments are flushed to IW. If
    we took this approach would we need a IW.getReaderRAM method
    that instead of flushing to disk flushes to the ramdir? The
    other problem with the IW.flushToRamDir system is the loss of
    concurrency where a large rambuffer may be flushing to disk
    while the user really wants to small incremental NRT RI based
    updates at the same time.

Jason Rutherglen made changes - 20/Apr/09 09:09 PM
Attachment LUCENE-1313.patch [ 12405961 ]
Jason Rutherglen added a comment - 24/Apr/09 07:31 PM
For this patch I'm debating whether to add a package protected
IndexWriter.addIndexWriter method. The problem is, the RAMIndex
blocks on the write to disk during IW.addIndexesNoOptimize which
if we're using ConcurrentMergeScheduler shouldn't happen?
Meaning in this proposed solution, if segments keep on piling up
in RAMIndex, we simply move them over to the disk IW which will
in the background take care of merging them away and to disk.

I don't think it's necessary to immediately write ram segments
to disk (like the current patch does), instead it's possible to
simply copy segments over from the incoming IW, leave them in
RAM and they can be merged to disk as necessary? Then on
IW.flush any segmentinfo(s) that are not from the current
directory can be flushed to disk?

Just thinking out loud about this.


Michael McCandless added a comment - 24/Apr/09 08:03 PM

I don't think it's necessary to immediately write ram segments to disk

I agree: it should be fine from IndexWriter's standpoint if some
segments live in a private RAMDir and others live in the "real" dir.

In fact, early versions of LUCENE-843 did exactly this: IW's RAM
buffer is not as efficient as a written segment, and so you can gain
some RAM efficiency by flushing first to RAM and then merging to disk.

I think we could adopt a simple criteria: you flush the new segment to
the RAM Dir if net RAM used is < maxRamBufferSizeMB. This way no
further configuration is needed. On auto-flush triggering you then
must take into account the RAM usage by this RAM Dir. On commit,
these RAM segments must be migrated to the real dir (preferably by
forcing a merge, somehow).

A near realtime reader would also happily mix "real" Dir and RAMDir
SegmentReaders.

This should work well I think, and should not require a separate
RAMIndex class, and won't block things when the RAM segments are
migrated to disk by CMS.


Jason Rutherglen added a comment - 24/Apr/09 09:52 PM

I think we could adopt a simple criteria: you flush the
new segment to the RAM Dir if net RAM used is <
maxRamBufferSizeMB. This way no further configuration is needed.
On auto-flush triggering you then must take into account the RAM
usage by this RAM Dir.

So we're ok with the blocking that occurs when the ram buffer is
flushed to the ramdir?

On commit, these RAM segments must be migrated to the
real dir (preferably by forcing a merge, somehow).

This is pretty much like resolveExternalSegments which would be
called in prepareCommit? This could make calls to commit much
more time consuming. It may be confusing to the user why
IW.flush doesn't copy the ram segments to disk.

A near realtime reader would also happily mix "real" Dir
and RAMDir SegmentReaders.

Agreed, however the IW.getReader MultiSegmentReader removes
readers from another directory so we'd need to add a new
attribute to segmentinfo that marks it as ok for inclusion in
the MSR?


Michael McCandless added a comment - 24/Apr/09 10:13 PM

So we're ok with the blocking that occurs when the ram buffer is
flushed to the ramdir?

Well... we don't have a choice (unless/until we implement IndexReader impl to directly search the RAM buffer). Still, this should be a good improvement over the blocking when flushing to a real dir.

This is pretty much like resolveExternalSegments which would be
called in prepareCommit? This could make calls to commit much
more time consuming. It may be confusing to the user why
IW.flush doesn't copy the ram segments to disk.

Similar... the difference is I'd prefer to do a merge of the RAM segments vs the straight one-for-one copy that resolveExternalSegments does.

commit would only become more time consuming in the NRT case? IE we'd only flush-to-RAMdir if it's getReader that's forcing the flush? In which case, I think it's fine that commit gets more costly. Also, I wouldn't expect it to be much more costly: we are doing an in-memory merge of N segments, writing one segment to the "real" directory. Vs writing each tiny segment as a real one. In fact, commit could get cheaper (when compared to not making this change) since there are fewer new files to fsync.

Agreed, however the IW.getReader MultiSegmentReader removes
readers from another directory so we'd need to add a new
attribute to segmentinfo that marks it as ok for inclusion in
the MSR?

Or, fix that filtering to also accept IndexWriter's RAMDir.


Jason Rutherglen added a comment - 24/Apr/09 10:48 PM
I'm confused as to how we make DocumentsWriter switch from
writing to disk vs the ramdir? It seems like a fairly major
change to the system? One that's hard to switch later on after
IW is instantiated? Perhaps the IW.addWriter method is easier in
this regard?

the difference is I'd prefer to do a merge of the RAM
segments vs the straight one-for-one copy that
resolveExternalSegments does.

Yeah I implemented it this way in the IW.addWriter code. I agree
it's better for IW.commit to copy all the ramdir segments to one
disk segment.

I started working on the IW.addWriter(IndexWriter, boolean
removeFrom) where removeFrom removes the segments that have been
copied to the destination writer from the source writer. This
method gets around the issue of blocking because potentially
several writers could concurrently be copied to the destination
writer. The only issue at this point is how the destination
writer obtains segmentreaders from source readers when they're
in the other writers' pool? Maybe the SegmentInfo can have a
reference to the writer it originated in? That way we can easily
access the right reader pool when we need it?


Michael McCandless added a comment - 25/Apr/09 10:02 AM

I'm confused as to how we make DocumentsWriter switch from
writing to disk vs the ramdir? It seems like a fairly major
change to the system? One that's hard to switch later on after
IW is instantiated? Perhaps the IW.addWriter method is easier in
this regard?

When we create SegmentWriteState (which is supposed to contain all
details needed to tell DW how/where to write the segment), we'd set
its directory to the RAMDir? That ought to be all that's needed
(though, it's possible some places use a private copy of the original
directory, which we should fix). DW should care less which Directory
the segment is written to...

> the difference is I'd prefer to do a merge of the RAM
> segments vs the straight one-for-one copy that
> resolveExternalSegments does.

Yeah I implemented it this way in the IW.addWriter code. I agree
it's better for IW.commit to copy all the ramdir segments to one
disk segment.

OK. Maybe we modify resolveExternalSegments to accept a "doMerge"?

I started working on the IW.addWriter(IndexWriter, boolean
removeFrom) where removeFrom removes the segments that have been
copied to the destination writer from the source writer. This
method gets around the issue of blocking because potentially
several writers could concurrently be copied to the destination
writer. The only issue at this point is how the destination
writer obtains segmentreaders from source readers when they're
in the other writers' pool? Maybe the SegmentInfo can have a
reference to the writer it originated in? That way we can easily
access the right reader pool when we need it?

I don't think we need two writers? I think one writer, sometimes
flushing to RAMDir, is a clean solution?


Michael McCandless added a comment - 25/Apr/09 11:39 AM
One separate optimization we should make with NRT is to not close the doc store (stored fields, term vector) files when flushing for an NRT reader.

We do close them now, which then makes merging quite a bit more costly.

The trickiness with this optimization is we'd need to be able to somehow share an IndexInput & IndexOutput; or, perhaps we can open an IndexInput even though an IndexOutput has the same file open (Windows may prevent this, though I think I've seen that it will in fact allow it).

Once we do that optimization, then with this RAMDir optimization we should try to have the doc store files punch straight through to the real directory, ie bypass the RAMDir. The doc stores are space consuming, and since with autoCommit=false we can bypass merging them, it makes no sense to store them in the RAMDir.

We should probably do this optimization as a "phase 2", after this one.


Jason Rutherglen added a comment - 27/Apr/09 07:37 PM

When we create SegmentWriteState (which is supposed to
contain all details needed to tell DW how/where to write the
segment), we'd set its directory to the RAMDir? That ought to be
all that's needed (though, it's possible some places use a
private copy of the original directory, which we should fix). DW
should care less which Directory the segment is written to...

Agreed that DW can write the segment to the RAMDir. I started
coding along these lines however what do we do about the RAMDir
merging? This is why I was thinking we'll need a separate IW?
Otherwise the ram segments (if they are treated the same as disk
segments) would quickly be merged to disk? Or we have two
separate merging paths?

If we have a disk IW and ram IW, I'm not sure how the docstores
to disk part would work though I'm sure there's some way to do
it.

modify resolveExternalSegments to accept a "doMerge"?

Sounds good.


Jason Rutherglen added a comment - 27/Apr/09 07:50 PM

we should make with NRT is to not close the doc store
(stored fields, term vector) files when flushing for an NRT
reader.

Agreed, I think this feature is a must otherwise we're doing
unnecessary in ram merging.

we'd need to be able to somehow share an IndexInput &
IndexOutput; or, perhaps we can open an IndexInput even though
an IndexOutput

I ran into problems with this before, I was trying to reuse
Directory to write a transaction log. It seemed theoretically
doable however it didn't work in practice. It could have been
the seeking and replacing but I don't remember. FSIndexOutput
uses a writeable RAF and FSIndexInput is read only why would
there be an issue?


Jason Rutherglen added a comment - 27/Apr/09 07:59 PM

doc store files punch straight through to the real
directory

To implement this functionality in parallel (and perhaps make
the overall patch cleaner), writing doc stores directly to a
separate directory can be a different patch? There can be an
option IW.setDocStoresDirectory(Directory) that the patch
implements? Then some unit tests that are separate from the near
realtime portion.


Yonik Seeley made changes - 27/Apr/09 08:49 PM
Link This issue depends upon LUCENE-1618 [ LUCENE-1618 ]
Michael McCandless added a comment - 27/Apr/09 08:56 PM

Agreed that DW can write the segment to the RAMDir. I started
coding along these lines however what do we do about the RAMDir
merging? This is why I was thinking we'll need a separate IW?
Otherwise the ram segments (if they are treated the same as disk
segments) would quickly be merged to disk? Or we have two
separate merging paths?

Hmm, right. We could exclude RAMDir segments from consideration by
MergePolicy? Alternatively, we could "expect" the MergePolicy to
recognize this and be smart about choosing merges (ie don't mix
merges)?

EG we do in fact want some merging of the RAM segments if they get too
numerous (since that will impact search performance).

> we should make with NRT is to not close the doc store
> (stored fields, term vector) files when flushing for an NRT
> reader.

Agreed, I think this feature is a must otherwise we're doing
unnecessary in ram merging.

OK, let's do this as a separate issue/optimization for NRT. There are
two separate parts to it:

  • Ability to store doc stores in "real" directory (looks like you
    opened LUCENE-1618 for this part).
  • Ability to "share" IndexOutput & IndexInput

I ran into problems with this before, I was trying to reuse
Directory to write a transaction log. It seemed theoretically
doable however it didn't work in practice. It could have been
the seeking and replacing but I don't remember. FSIndexOutput
uses a writeable RAF and FSIndexInput is read only why would
there be an issue?

Hmm... seems like we need to investigate further. We could either
"ask" an IndexOutput for its IndexInput (sharing the underlying RAF),
or try to separately open an IndexInput (which may not work on
Windows).

To implement this functionality in parallel (and perhaps make
the overall patch cleaner), writing doc stores directly to a
separate directory can be a different patch? There can be an
option IW.setDocStoresDirectory(Directory) that the patch
implements? Then some unit tests that are separate from the near
realtime portion.

Yes, separate issue (LUCENE-1618).


Michael McCandless added a comment - 27/Apr/09 09:09 PM
So let's leave this issue focused on sometimes using RAMDir for newly created segments.

Jason Rutherglen added a comment - 27/Apr/09 09:20 PM

We could exclude RAMDir segments from consideration by
MergePolicy? Alternatively, we could "expect" the MergePolicy to
recognize this and be smart about choosing merges (ie don't mix
merges)?

Is this over complicating things? Sometimes we want a mixture of
RAMDir segments and FSDir segments to merge (when we've decided
we have too much in ram), sometimes we don't (when we just want
the ram segments to merge). I'm still a little confused as to
why having a wrapper class that manages a disk writer and a ram
writer isn't cleaner?


Michael McCandless added a comment - 27/Apr/09 10:16 PM

Sometimes we want a mixture of
RAMDir segments and FSDir segments to merge (when we've decided
we have too much in ram),

I don't think we want to mix RAM & disk merging?

EG when RAM is full, we want to quickly flush it to disk as a single
segment. Merging with disk segments only makes that flush slower?

I'm still a little confused as to
why having a wrapper class that manages a disk writer and a ram
writer isn't cleaner?

This is functionally the same as not mixing RAM vs disk merging,
right (ie just as "clean")?


Michael McCandless added a comment - 28/Apr/09 03:46 PM
Yonik raised a good question on LUCENE-1618, which is what gains do we really expect to see by using RAMDir for the tiny recently flushed segments?

It would be nice if we could approximately measure this before putting more work into this issue – if the gains are not "decent" this optimization may not be worthwhile.

Of course, we are talking about 100s of milliseconds for the turnaround time to add docs & open an NRT reader, so if the time for writing/opening many tiny files in RAMDir vs FSDir differs by say 10s of msecs then we should pursue this. We should also consider that the IO system may very well be quite busy (doing merge(s), backups, etc.) and that'd make it slower to have to create tiny files.

A simpler optimization might be to allow using CFS for tiny files (even when CFS is turned off), but built the CFS in RAM (ie, write tiny files first to RAMFiles, then make the CFS file on disk). That might get most of the gains since the FSDir sees only one file created per tiny segment, not N.


Yonik Seeley added a comment - 28/Apr/09 04:02 PM

Yonik raised a good question on LUCENE-1618, which is what gains do we really expect to see by using RAMDir for the tiny recently flushed segments?

I raised it more because of the direction the discussion was veering (write through caching to a RAMDirectory, and RAMDirectory being faster to search). I do believe that RAMDirectory can probably improve NRT, but it would be due to avoiding waiting for file open/write/close/open/read (as Mike also said)... and not any difference during IndexSearcher.search(), which should be irrelevant due to the relative size differences of the RAMDirectory and the FSDirectory. Small file creation speeds will also be heavily dependent on the exact file system used.


Jason Rutherglen added a comment - 28/Apr/09 10:05 PM

EG when RAM is full, we want to quickly flush it to disk
as a single segment. Merging with disk segments only makes that
flush slower?

I assume it's ok for the IW.mergescheduler to be used which may
not immediately perform the merge to disk (in the case of
ConcurrentMergeScheduler)? When implementing using
addIndexesNoOptimize (which blocks) I realized we probably don't
want blocking to occur because that means shutting down the
updates.

Also a random thought, it seems like ConcurrentMergeScheduler
works great for RAMDir merging, how does it compare with
SerialMS on an FSDirectory? It seems like it shouldn'y be too much
faster given the IO sequential access bottleneck?


Michael McCandless added a comment - 29/Apr/09 10:03 AM

I assume it's ok for the IW.mergescheduler to be used which may
not immediately perform the merge to disk (in the case of
ConcurrentMergeScheduler)?

Only if we "accept" requiring MergePolicy to be aware that some
segments are in RAMDir and some are in the "real" Dir and to "act
accordingly", ie 1) don't mix the dirs when merging, 2) when RAM is
"full" merge every single RAM segment into a single "real Dir" segment
(requires IW to provide exposure on how much RAM DW's buffer is
currently consuming), 3) properly "maintain" the RAM segments (ie,
merge RAM -> RAM somehow) so that searchers don't search too many RAM
segments.

I think this approach is probably best: you're right that allowing CMS
to manage these RAM segments is nice since it'll happen in the BG and
will not block updates.

It does mean, though, that the RAM usage semantics of IW is no longer
so "crisp" as flushing today ("once RAM is full, stop world & flush it
to disk, then resume") but I think that's acceptable and perhaps
preferable since world is no longer stopped to flush RAM -> disk.

Though one trickiness is... if a large RAM -> RAM merge takes place,
we temporarily double the RAM consumption. I think MergePolicy simply
shouldn't do that. Ie at not point should it be merging a very large
%tg of the RAM segments. It should instead merge RAM -> disk.

This'd also mean advanced users that implement their own MergePolicy
must realize when IW is used with NRT reader that additional smarts is
recommended wrt

When implementing using
addIndexesNoOptimize (which blocks) I realized we probably don't
want blocking to occur because that means shutting down the
updates.

Right, this is one of the strong reasons to do the "internal" approach
vs "external" one.

Also a random thought, it seems like ConcurrentMergeScheduler
works great for RAMDir merging, how does it compare with
SerialMS on an FSDirectory? It seems like it shouldn'y be too much
faster given the IO sequential access bottleneck?

By far the biggest win of CMS over SMS is in the first merge, because
it does not block the further addition of docs. Thus an app can
continue indexing into RAM buffer (consuming CPU & RAM resources)
while a BG thread consumes RAM + IO resources. This is very much a
win.

Beyond the first merge...in theory, modern IO systems have concurrency
(eg the NCQ in a single SATA drive) so you should "gain" by having
several threads performing IO at once. The OS & hard drives attempt
to re-order the request in a more optimal way (like an elevator,
sweeping floors). I haven't explictly tested this with Lucene...

I believe SSDs handle concurrent requests very well since under the
hood most of them are multi-channel basically RAID0 devices (eg Intel
X25M has 10 channels).


Jason Rutherglen added a comment - 29/Apr/09 09:02 PM - edited
A merge policy may be more optimal for ram segments vs disk
segments. Perhaps the best way to make this clean is to keep the
ram merge policy and real dir merge policies different? That way
we don't merge policy implementations don't need to worry about
ram and non-ram dir cases.

Perhaps an IW.updatePendingRamMerges method should be added that
handles this separately? Does the ram dir ever need to worry
about things like maxNumSegmentsOptimize and optimize?

Right, this is one of the strong reasons to do the
"internal" approach vs "external" one.

I think having the ram merge policy should cover the reasons I
had for having a separate ram writer. Although the IW.addWriter
method I implemented would not have blocked, but I don't think
it's necessary now if we have a separate ram merge policy.


Michael McCandless added a comment - 30/Apr/09 01:17 PM

Perhaps the best way to make this clean is to keep the
ram merge policy and real dir merge policies different? That way
we don't merge policy implementations don't need to worry about
ram and non-ram dir cases.

OK tentatively this feels like a good approach. Would you re-use
MergePolicy, or make a new RAMMergePolicy?

Would we use the same MergeScheduler to then execute the selected
merges?

How would we handle the "it's time to flush some RAM to disk" case?
Would RAMMergePolicy make that decision?

Perhaps an IW.updatePendingRamMerges method should be added that handles this separately?

Yes?

Does the ram dir ever need to worry about things like maxNumSegmentsOptimize and optimize?

No?

I think having the ram merge policy should cover the reasons I
had for having a separate ram writer. Although the IW.addWriter
method I implemented would not have blocked, but I don't think
it's necessary now if we have a separate ram merge policy.

OK good.


Jason Rutherglen added a comment - 30/Apr/09 08:25 PM

Would you re-use MergePolicy, or make a new
RAMMergePolicy?

MergePolicy is used as is with a special IW method that handles
merging ram segments for the real directory (which has an issue
around merging contiguous segments, can that be relaxed in this
case as I don't understand why this is?)

The patch is not committable, however I am posting it to show a
path that seems to work. It includes test cases for merging in
ram and merging to the real directory.

  • IW.getFlushDirectory is used by internal calls to obtain the
    directory to flush segments to. This is used in DocumentsWriter
    related calls.
  • DocumentsWriter.directory is removed so that methods requiring
    the directory call IW.getFlushDirectory instead.
  • IW.setRAMDirectory sets the ram directory to be used.
  • IW.setRAMMergePolicy sets the merge policy to be used for
    merging segments on the ram dir.
  • In IW.updatePendingMerges totalRamUsed is the size of the ram
    segments + the ram buffer used. If totalRamUsed exceeds the max
    ram buffer size then IW. updatePendingRamMergesToRealDir is
    called.
  • IW. updatePendingRamMergesToRealDir registers a merge of the
    ram segments to the real directory (currently causes a
    non-contiguous segments exception)
  • MergePolicy.OneMerge has a directory attribute used when
    building the merge.info in _mergeInit.
  • Test case includes testMergeInRam, testMergeToDisk,
    testMergeRamExceeded

There is one error that occurs regularly in testMergeRamExceeded

MergePolicy selected non-contiguous segments to merge
(_bo:cx83 _bm:cx4 _bn:cx2 _bl:cx1->_bj _bp:cx1->_bp _bq:cx1->_bp
_c2:cx1->_c2 _c3:cx1->_c2 _c4:cx1->_c2 vs _5x:c120 _6a:c8
_6t:c11 _bo:cx83** _bm:cx4** _bn:cx2** _bl:cx1->_bj**
_bp:cx1->_bp** _bq:cx1->_bp** _c1:c10 _c2:cx1->_c2**
_c3:cx1->_c2** _c4:cx1->_c2**), which IndexWriter (currently)
cannot handle

Jason Rutherglen made changes - 30/Apr/09 08:25 PM
Attachment LUCENE-1313.patch [ 12406953 ]
Jason Rutherglen added a comment - 30/Apr/09 09:59 PM
  • Ok, fixed the ensureContiguousMerge exception by asking the
    mergePolicy (not ramMergePolicy) to evaluate the ram segment
    infos as an optimize to directory. Now all the current tests
    pass.
  • The patch is cleaned up a little, needs more, and further test
    cases.
  • IndexWriter doesn't accept setRAMDirectory anymore, it needs to
    be passed into the IndexWriter constructor. This because we
    can't run the system and the ram dir is changed in the middle of
    an operation.

Jason Rutherglen made changes - 30/Apr/09 09:59 PM
Attachment LUCENE-1313.patch [ 12406961 ]
Jason Rutherglen added a comment - 30/Apr/09 11:19 PM
Fixed and cleaned up more.

All tests pass

Added entry in CHANGES.txt

I'm going to integrate LUCENE-1618 and test that out as a part of the next patch.


Jason Rutherglen made changes - 30/Apr/09 11:19 PM
Attachment LUCENE-1313.patch [ 12406973 ]
Michael McCandless added a comment - 01/May/09 01:57 PM
Patch looks good! Some comments:
  • I don't think the caller should provide the RAMDir... and we
    should have a getter to get it. I think this should be
    under-the-hood. This is simply a nice way for IW to use RAM as
    buffer in the presence of frequent NRT readers being opened.
  • If NRT is never used, the behavior of IW should be unchanged
    (which is not the case w/ this patch I think). RAMDir should be
    created the first time a flush is done due to NRT creation.
  • StoredFieldsWriter & TermVectorsTermsWriter now writes to
    IndexWriter.getFlushDirectory(), which is confusing because that
    method returns the RAMDir if set? Shouldn't this be the opposite?
    (Ie it should flush to IndexWriter.getDirectory()? Or we should
    change getFlushDiretory to NOT return the ramdir?)
  • Why did you need to add synchronized to some of the SegmentInfo
    files methods? (What breaks if you undo that?). The contract
    here is IW protects access to SegmentInfo/s.
  • The MergePolicy needs some smarts when it's dealing w/ RAM. EG it
    should not do a merge of more than XXX% of total RAM usage (should
    flush to the real directory instead).
  • Nothing is calling the new ramOverLimit?
  • Still some noise (MockRAMDir, DocFieldProcessorPerThread, some
    changes in LogMergePolicy)

Jason Rutherglen added a comment - 01/May/09 06:38 PM
  • IndexFileDeleter takes into account the ram directory (which
    when using NRT with the FSD caused files to not be found).
  • FSD is included and writes fdx, fdt, tvx, tvf, tvd extension
    files to the primary directory (which is the same as
    IW.directory). LUCENE-1618 needs to be updated with these
    changes (or we simply include it in this patch as the
    LUCENE-1618 patch is only a couple of files).
  • Removed DocumentsWriter.ramOverLimit
  • I think we need to give the option of a ram mergescheduler
    because the user may want not want the ram merging and disk
    merging to compete for threads. I'm thinking if of the use case
    where NRT is a priority then one may allocate more threads to
    the ram CMS and less to the disk CMS. This also gives us the
    option of trying out more parameters when performing benchmarks
    of NRT.
  • We may want to default the ram mergepolicy to not use compound
    files as it's not useful when using a ram dir?
  • Because FSD uses IW.directory, FSD will list files that
    originated from FSD and from IW.directory, we may want to keep
    track of which files are supposed to be in FSD (from the
    underlying primary dir) and which are not?

If NRT is never used, the behavior of IW should be
unchanged (which is not the case w/ this patch I think). RAMDir
should be created the first time a flush is done due to NRT
creation.

In the patch if ramdir is not passed in, the behavior of IW
remains the same as it is today. You're saying we should have IW
create the ramdir by default after getReader is called and
remove the IW ramdir constructor? What if the user has an
alternative ramdir implementation they want to use?

StoredFieldsWriter & TermVectorsTermsWriter now writes to
IndexWriter.getFlushDirectory(), which is confusing because that
method returns the RAMDir if set? Shouldn't this be the
opposite? (Ie it should flush to IndexWriter.getDirectory()? Or
we should change getFlushDiretory to NOT return the
ramdir?)

The attached patch uses FileSwitchDirectory, where these files
are written to the primary directory (IW.directory). So
getFlushDirectory is ok?

Why did you need to add synchronized to some of the
SegmentInfo files methods? (What breaks if you undo that?). The
contract here is IW protects access to SegmentInfo/s

SegmentInfo.files was being cleared while sizeInBytes was called
which resulted in an NPE. The alternative is sync IW in
IW.size(SegmentInfos) which seems a bit extreme just to obtain
the size of a segment info?

The MergePolicy needs some smarts when it's dealing w/
RAM. EG it should not do a merge of more than XXX% of total RAM
usage (should flush to the real directory instead)

Isn't this handled well enough in updatePendingMerges or is
there more that needs to be done?


Jason Rutherglen made changes - 01/May/09 06:38 PM
Attachment LUCENE-1313.patch [ 12407030 ]
Michael McCandless added a comment - 02/May/09 10:51 AM

IndexFileDeleter takes into account the ram directory (which
when using NRT with the FSD caused files to not be found).

I don't like how "deep" the dichotomy of "RAMDir vs FSDir" is being
pushed. Why can't we push FSD down to all these places (IFD,
SegmentInfo/s, etc.)?

FSD is included and writes fdx, fdt, tvx, tvf, tvd extension
files to the primary directory (which is the same as
IW.directory). LUCENE-1618 needs to be updated with these
changes (or we simply include it in this patch as the
LUCENE-1618 patch is only a couple of files).

Why did this require changes to FSD?

I think we need to give the option of a ram mergescheduler
because the user may want not want the ram merging and disk
merging to compete for threads. I'm thinking if of the use case
where NRT is a priority then one may allocate more threads to
the ram CMS and less to the disk CMS. This also gives us the
option of trying out more parameters when performing benchmarks
of NRT.

I think we're unlikely to gain from more than 1 BG thread for RAM
merging? But I agree it'd be horrible if CMS blocked RAM merging
because its allotted threads were tied up merging disk segments.
Could we simply make the single CMS instance smart enoguh to realize
that a single RAM merge is allowed to proceed regardless of the thread
limit?

We may want to default the ram mergepolicy to not use compound
files as it's not useful when using a ram dir?

I think actually hardwire it, not just default. Building CFS in RAM
makes no sense. Worse, if we allow one to choose to do it we then
have to fix FSD to understand CFX must go to the dir too, and, we'd
have to fix IW to not merge in the doc store files when building a
private CFS. Net/net I think we should not allow CFS for the RAM
segments.

On merging to disk it can then respect the user's CFS setting.

Because FSD uses IW.directory, FSD will list files that
originated from FSD and from IW.directory, we may want to keep
track of which files are supposed to be in FSD (from the
underlying primary dir) and which are not?

I don't understand what's wrong here?

> If NRT is never used, the behavior of IW should be
> unchanged (which is not the case w/ this patch I think). RAMDir
> should be created the first time a flush is done due to NRT
> creation.

In the patch if ramdir is not passed in, the behavior of IW
remains the same as it is today. You're saying we should have IW
create the ramdir by default after getReader is called and
remove the IW ramdir constructor?

Right. This should be "under the hood".

What if the user has an alternative ramdir implementation they want to
use?

I think I'd rather not open up that option just yet. This really is a
private optimization to how IW uses RAM. We may want to further
change/improve how RAM is used.

Way back when, IW used a RAMDir internally for buffering; then, with
LUCENE-843 we switched to whole different format (DW's ram
buffering). Now we are adding back RAMDir for NRT; maybe we'll switch
its format at some point... or change NRT to directly search DW's
RAM... etc. How IW uses RAM is very much an internal detail so I'd
rather not expose it publically.

[BTW: once we have this machinery online, it's conceivable that we'd
want to flush to RAMDir even in the non-NRT case. EG, say DW's RAM
buffer is full and it's time to flush. If it flushes to RAM,
typically the RAMDir is far more compact than DW's RAM buffer and it
then still has some more space to work with, before having to flush to
disk. If we explore this it should be in a new issue (later)...]

> StoredFieldsWriter & TermVectorsTermsWriter now writes to
> IndexWriter.getFlushDirectory(), which is confusing because that
> method returns the RAMDir if set? Shouldn't this be the
> opposite? (Ie it should flush to IndexWriter.getDirectory()? Or
> we should change getFlushDiretory to NOT return the
> ramdir?)

The attached patch uses FileSwitchDirectory, where these files
are written to the primary directory (IW.directory). So
getFlushDirectory is ok?

OK, though I'd like to simply always use FSD, even if primary &
secondary are the same dir. All these if's checking for both dirs,
passing both dirs deep into Lucene's APIs, etc., are spooky.

> Why did you need to add synchronized to some of the
> SegmentInfo files methods? (What breaks if you undo that?). The
> contract here is IW protects access to SegmentInfo/s

SegmentInfo.files was being cleared while sizeInBytes was called
which resulted in an NPE. The alternative is sync IW in
IW.size(SegmentInfos) which seems a bit extreme just to obtain
the size of a segment info?

But... why did we have one thread asking for size while another was
tweaking the SegmentInfo? What leads to that? We need to better
understand the root cause here.

The size consumed by the RAM segments should be carefully computed
(called only in sychchronized(iw) context) and then shared. This value
changes relatively rarely (on flushing a new segment to ram; on
applying deletes that include RAM segments; on doing a ram->ram
merge), but is read frequently (per doc added, to decide whether it's
time to flush). I think the value should be pushed to DW whenever it
changes, via synchronized method in DW; and then the existing
synchronized logic in DW that decides if it's time to "flush after"
should consult that value. No further synchronizing should be
necessary.

Also, this ram size should be used not only for deciding when it's
time to merge to a disk segment, but also when it's time for DW to
flush a new segment (which I think your current patch is missing?).

> The MergePolicy needs some smarts when it's dealing w/ RAM. EG it
> should not do a merge of more than XXX% of total RAM usage (should
> flush to the real directory instead).

Isn't this handled well enough in updatePendingMerges or is
there more that needs to be done?

There is more that needs to be done, because MergePolicy must
conditionalize its logic based on RAM vs FS. Ie, if our RAM buffer is
32 MB, and there are say 31 MB of RAM segments that suddenly need
merging (becuase we just flushed the 10th RAM segment), we should not
do a RAM -> RAM merge at that point (because 31./32. = very high pctg
of our net RAM buffer). Instead we should force RAM -> disk at that
point, even though technically RAM is not yet full.

Ooh: maybe a better approach is to disallow the merge if the expected
peak RAM usage will exceed our buffer. I like this better. So if
budget is 32 MB, and net RAM used (segments + DW) is say 22, we have a
10 MB "budget", so we are allowed to select merges that total to < 10
MB.


Jason Rutherglen added a comment - 04/May/09 06:31 PM

I don't like how "deep" the dichotomy of "RAMDir vs
FSDir"

Agreed, it's a bit awkward but I don't see another way to do
this. The good thing is if IW has written some .fdt files to the
main dir (via FSD), IW crashes, then IW is created again, IFD
automatically deletes the extraneous .fdt (and other extension)
files.

Why can't we push FSD down to all these places (IFD,
SegmentInfo/s, etc.)?

Could we simply make the single CMS instance smart enough
to realize that a single RAM merge is allowed to proceed
regardless of the thread limit?

Hmm... I think for benchmarking it would be good to allow
options as we simply don't know. In the latest patch a ram
mergescheduler can be set to the IndexWriter.

have to fix FSD to understand CFX must go to the dir
too

I think this is fixed in the patch, where compound files are not
created in RAM.

You're saying we should have IW create the ramdir by default
after getReader is called and remove the IW ramdir constructor?
Right. This should be "under the hood".

Ok, this will require some reworking of the patch.

OK, though I'd like to simply always use FSD, even if
primary & secondary are the same dir.

How will always using FSD work? Doesn't it assume writing to two
different directories?

this ram size should be used not only for deciding when
it's time to merge to a disk segment, but also when it's time
for DW to flush a new segment

In the new patch this is fixed.

So if budget is 32 MB, and net RAM used (segments + DW)
is say 22, we have a 10 MB "budget", so we are allowed to select
merges that total to < 10 MB.

One issue is the ram buffer flush doubles the ram used (because
the segment is flushed as is to the RAM dir). You're saying
roughly estimate the ram size used on the result of a merge and
have the merge policy take this into account? This makes sense,
otherwise we will consistently (if temporarily) exceed the ram
buffer size. The algorithm is fairly simple? Find segments whose
total sizes are lower than whatever we have left of the max ram
buffer size? I have new code, but will rework it a bit to
include this discussion.


Michael McCandless added a comment - 04/May/09 10:14 PM

> OK, though I'd like to simply always use FSD, even if
> primary & secondary are the same dir.

How will always using FSD work? Doesn't it assume writing to two
different directories?

I think on creating IW the user should state (via new expert ctor)
that they intend to use it for NRT (say, a new boolean
"enableNearRealTime").

Then we could pass IFD either an FSD (when in NRT mode) or the normal
directory when not in NRT mode. IFD would not longer have to
duplicate FSD's logic (summing the two dir's listAlls, the
getDirectoryForFile).

SegmentInfos.hasExternalSegments, and MultiSegmentReader ctor, should
be "smart" when they're passed an FSD (probably we should add
Directory.contains(Directory) method, which by default returns true if
this.equals(dir), but FSD would override to return true if the
incoming dir .equals primary & secondary).

Likewise all the switching in DW to handle two dirs should be rolled
back (eg you adde DW.fileLength(name, dir1, dir2) that's dup code with
FSD).

One issue is the ram buffer flush doubles the ram used (because
the segment is flushed as is to the RAM dir).

I think we must keep transient RAM usage below the specified limit, so
that limits our flushing freedom. Ie, in the NRT case, once DW's RAM
buffer exceeds half of the allowed remaining RAM budget (ie, the limit
minus total RAM segments) then we trigger a flush to RAM and then to
the "real" dir.

Or... we could flush the new segment directly to the real dir as one
segment, and merge all prior RAM segments as a separate new segment in
the main dir, if the free RAM is large enough.

> this ram size should be used not only for deciding when
> it's time to merge to a disk segment, but also when it's time
> for DW to flush a new segment

In the new patch this is fixed.

I don't see where this is taken into account? Did you mean to attach
a new patch?


Jason Rutherglen added a comment - 05/May/09 12:32 AM
  • In DocumentsWriter.balanceRAM if NRT is on the total ram
    consumed is "(numBytesUsed * 2) + writer.getRamDirSize()".
    numBytesUsed is the current consumption of the ram buffer.
    Basically what we flush to ram, we'll consume that much of the
    buffer. This is now taken into account in the bufferIsFull
    calculation.
  • Double dir usage should be factored out.
  • TestIndexWriterRamDir.testFSDirectory fails. It tries to
    simulate a crashing IW. When the IW is created again it should
    delete the old files, for some reason it's not with FSDirectory
    (open file handles on Windows perhaps)

we could flush the new segment directly to the real dir
as one segment, and merge all prior RAM segments as a separate
new segment in the main dir, if the free RAM is large enough.

Yeah it's unclear what the best policy is here. Do we want to
have some sort of custom merge policy method/class to take care
of this so the user can customize it?


Jason Rutherglen made changes - 05/May/09 12:32 AM
Attachment LUCENE-1313.patch [ 12407201 ]
Jason Rutherglen added a comment - 05/May/09 12:52 AM
Did we decide to simply add a boolean param in the ctor to turn
on NRT instead of relying on getReader. Using getReader could
cause problems with switching directories midstream.

Michael McCandless added a comment - 05/May/09 08:26 AM

Did we decide to simply add a boolean param in the ctor to turn
on NRT instead of relying on getReader. Using getReader could
cause problems with switching directories midstream.

Yes, let's switch to that.


Michael McCandless added a comment - 05/May/09 09:00 AM
  • We shouldn't add FSD.setPrimaryExtensions?
  • The dual directories is continuing to push deeper (when I'm
    wanting to do the reverse). EG, MergeScheduler.getDestinationDirs
    should not be needed?
  • We should no longer need IndexWriter.getFlushDirectory? IE, IW
    once again has a single "Directory" as seen by IFD,
    DocFieldProcessorPerThread, etc. In the NRT case, this is an FSD;
    in the non-NRT case it's the Dir that was passed in (unless, in a
    future issue, we explore using FSD, too, for better performance).
  • We can't up and change CMS.handleMergeException (breaks
    back-compat); can you deprecate old & add a new one that calls old
    one? Let's have the new one take a Throwable and
    MergePolicy.OneMerge?
  • Instead of overriding "equals" (FSD.equals) can you change to
    "Directory.contains"?
  • IW's RAMDir usage still isn't factored in properly. EG
    DW.doBalancRAM is not taking it into account.
  • Furthermore, we can't call writer.getRamDirSize() from
    DW.balanceRAM – that's far too costly. Instead, whenever RAMDir
    changes (deletes are applied, or a new RAM segment is created), we
    must push down to DW that usage with a new synchronized method.
    (I described this above). We should remove
    IW.getRamDirSize()... ie, this size should always be "pushed on
    change", not "polled on read". We change it rarely and read it
    incredibly often.
  • We don't need IW.getRamLogMergePolicy()? Instead, let's ignore
    "MergePolicy.useCompoundFile()" when we are writing the new
    segment to RAMDir? Likewise we should not cast RAMMergePolicy to
    LogMergePolicy in setRAMMergePolicy, nor turn off its CFS there.
  • I still don't think we need a separate RAMMergeScheduler; I think
    CMS should simply always run such merges (ie not block on max
    thread count). IW.getNextMerge can then revert to its former
    self.
  • MergePolicy.OneMerge.segString no longer needs to take a
    Directory (because it now stores a Directory).
  • The mergeRAMSegmentsToDisk shouldn't be fully synchronized, eg
    when doWait is true it should release the lock while merges are
    taking place.

Jason Rutherglen added a comment - 05/May/09 07:53 PM

RAMDir changes (deletes are applied, or a new RAM segment is
created), we must push down to DW that usage with a new synchronized
method.

Sounds like we create a subclass of RAMDirectory with this
functionality?

We don't need IW.getRamLogMergePolicy()?

Because we don't want the user customizing this?

We should no longer need IndexWriter.getFlushDirectory? IE, IW
once again has a single "Directory" as seen by IFD,
DocFieldProcessorPerThread, etc. In the NRT case, this is an FSD; in
the non-NRT case it's the Dir that was passed in (unless, in a future
issue, we explore using FSD, too, for better performance).

Pass in FSD in the constructor of DocumentsWriter (and others) as
before?

I still don't think we need a separate RAMMergeScheduler; I
think CMS should simply always run such merges (ie not block on max
thread count). IW.getNextMerge can then revert to its former
self.

Where does the thread come from for this if we're using max threads?
If we allocate one, we're over limit and keeping it around. We'd need
a more advanced threadpool that elastically grows the thread pool and
kills threads that are unused over time. With Java 1.5 we can use
ThreadPoolExecutor. Is a dedicated thread pool something we want to
go to? Even then we can potentially still max out a given thread pool
with requests to merge one directory or the other. We'd probably
still need two separate thread pools.

MergePolicy.OneMerge.segString no longer needs to take a
Directory (because it now stores a Directory).

Yeah, I noticed this, I'll change it. MergeSpecification.segString is
public and takes a directory that is not required. What to do?


Jason Rutherglen added a comment - 05/May/09 09:37 PM

The dual directories is continuing to push deeper (when I'm
wanting to do the reverse). EG, MergeScheduler.getDestinationDirs
should not be needed?

If we remove getFlushDirectory, are you saying getDirectory should
return the FSD if RAM NRT is turned on? This seems counter intuitive
in that we still need a clear separation of the two directories? The
user would expect the directory they passed into the ctor to be
returned?

getDestinationDirs is used by the ram merge scheduler, which if we
use a single CMS would go away.

I'm looking at how to get RAMLogMergePolicy to take into account the
size of the ram segments it's merging such that they do not total
beyond the remaining available ram. Looks like we could keep a
running byte total while it's building the merges and stop once we've
reached the limit, though I'm not sure how exact this is (will the
merges be balanced using this system?). It seems like a variation on
the LogByteSizeMergePolicy however it's unclear whether
LogDocMergePolicy or LogByteSizeMergePolicy ram merges will perform
better (does it matter since it's all in ram and we're capping the
total?)


Jason Rutherglen added a comment - 06/May/09 05:21 AM
I'm not sure we have the right model yet for deciding when to
flush the ram buffer and/or ram segments. Perhaps we can simply
divide the ram buffer size in half, allocating one part to the
ram buffer, the other to the ram segments. When one exceeds it's
(rambuffersize/2) allotment, it's flushed to disk. This way if
the ram buffer size is 32MB, we will always safely flush 16MB to
disk. The more ram allotted, greater the size of what's flushed
to disk. We may eventually want to offer an expert method to set
the ram buffer size and ram dir max size individually.

Put another way I think we need a balanced upper limit for the
ram buffer and the NRT ram dir, which seems (to me) to be hard
to achieve by allowing too much growth at the expensive of the
other.

I'd like to stay away from flushing the ram buffer to disk when
it's below say 20% of the ram buffer size as it seems
inefficient to do this (because we'll have to do an expensive
disk merge on it later). On the other hand if the user is not
calling get reader very often and we're auto flushing at 1/2 the
ram buffer size, we're short changing ourselves and only
flushing a segment half the size of what it could be. I suppose
we could stick with the 1/2 model, only turning it on once ram
segments are being merged in ram?

If when merging ram segments (using the specialized
RAMMergePolicy) we only merge in ram the ones that fit, what do
we do with the ram segments remaining that need to be flushed to
disk? What if they are only make up 20% of the total size of the
ram segments? If we merge the 20% to disk it seems inefficient?


Michael McCandless added a comment - 06/May/09 11:16 AM

> RAMDir changes (deletes are applied, or a new RAM segment is
> created), we must push down to DW that usage with a new synchronized
> method.

Sounds like we create a subclass of RAMDirectory with this
functionality?

I don't think that's needed. I think whenever IW makes a change to
the RAMDir, which is easily tracked, it pushes to DW the new RAMDir
size.

> We don't need IW.getRamLogMergePolicy()?

Because we don't want the user customizing this?

That, and because it's only used to determine CFS or not, which we've
turned off for RAMDir.

> We should no longer need IndexWriter.getFlushDirectory? IE, IW
> once again has a single "Directory" as seen by IFD,
> DocFieldProcessorPerThread, etc. In the NRT case, this is an FSD; in
> the non-NRT case it's the Dir that was passed in (unless, in a future
> issue, we explore using FSD, too, for better performance).

Pass in FSD in the constructor of DocumentsWriter (and others) as
before?

Right. All these places could care less if they are dealing w/ FSD or
a "real" dir. They should simply use the Directory API as they
previously did.

> I still don't think we need a separate RAMMergeScheduler; I
> think CMS should simply always run such merges (ie not block on max
> thread count). IW.getNextMerge can then revert to its former
> self.

Where does the thread come from for this if we're using max threads?
If we allocate one, we're over limit and keeping it around. We'd need
a more advanced threadpool that elastically grows the thread pool and
kills threads that are unused over time. With Java 1.5 we can use
ThreadPoolExecutor. Is a dedicated thread pool something we want to
go to? Even then we can potentially still max out a given thread pool
with requests to merge one directory or the other. We'd probably
still need two separate thread pools.

The thread is simply launched w/o checking maxThreadCount, if the
merge is in RAM.

Right, with JDK 1.5 we can make CMS better about pooling threads.
Right now it does no long-term pooling (unless another merge happens
to be needed when a thread finishes its last merge).

> MergePolicy.OneMerge.segString no longer needs to take a
> Directory (because it now stores a Directory).

Yeah, I noticed this, I'll change it. MergeSpecification.segString is
public and takes a directory that is not required. What to do?

Do the usual back-compat dance – deprecate it and add the new one.

> The dual directories is continuing to push deeper (when I'm
> wanting to do the reverse). EG, MergeScheduler.getDestinationDirs
> should not be needed?

If we remove getFlushDirectory, are you saying getDirectory should
return the FSD if RAM NRT is turned on? This seems counter intuitive
in that we still need a clear separation of the two directories? The
user would expect the directory they passed into the ctor to be
returned?

I agree, we should leave getDirectory() as is (returns whatever Dir
was passed in).

We can keep getFlushDirectory, but it should not have duality inside it
– it should simply return the FSD (in the NRT case) or the normal
dir. I don't really like the name getFlushDirectory... but can't
think of a better one yet.

Then, nothing outside of IW should ever know there are two directories
at play. They all simply deal with the one and only Directory that IW
hands out.

On the "when to flush to RAM" question... I agree it's tricky. This
logic belongs in the RAMMergePolicy. That policy needs to be
empowered to decide if a new flush goes to RAM or disk, to decide when
to merge all RAM segments to a new disk segment, to be able to check
if IW is in NRT mode, etc. Probably the RAM merge policy also needs
control over how much of the RAM buffer it's going to give to DW,
too. At first the policy should not change the non-NRT case (ie one
always flushes straight to disk). We can play w/ that in a separate
issue. Need to think more about the logic...


Jason Rutherglen added a comment - 06/May/09 07:13 PM

I don't think that's needed. I think whenever IW makes a
change to the RAMDir, which is easily tracked, it pushes to DW
the new RAMDir size.

Because we know the IW.ramdir is a RAMDirectory implementation,
we can use sizeInBytes? It's synchronized, maybe we want a
different method that's not? It seems like keeping track of all
files writes outside ramdir is going to be difficult? For
example when we do deletes via SegmentReader how would we keep
track of that?

That, and because it's only used to determine CFS or not,
which we've turned off for RAMDir.

So we let the user set the RAMMergePolicy but not get it?

The thread is simply launched w/o checking
maxThreadCount, if the merge is in RAM.

Hmm... We can't just create threads and let them be garbage
collected as JVMs tend to throw OOMs with this. If we go down
this route of a single CMS, maybe we can borrow some code from
an Apache project that's implemented a threadpool.


Michael McCandless added a comment - 06/May/09 09:53 PM

> I don't think that's needed. I think whenever IW makes a
> change to the RAMDir, which is easily tracked, it pushes to DW
> the new RAMDir size.

Because we know the IW.ramdir is a RAMDirectory implementation,
we can use sizeInBytes? It's synchronized, maybe we want a
different method that's not? It seems like keeping track of all
files writes outside ramdir is going to be difficult? For
example when we do deletes via SegmentReader how would we keep
track of that?

We should definitely just use the sizeInBytes() method.

I'm saying that IW knows when it writes new files to the RAMDir
(flushing deletes, flushing new segment) and it's only at those times
that it should call sizeInBytes() and push that value down to DW.

> That, and because it's only used to determine CFS or not,
> which we've turned off for RAMDir.

So we let the user set the RAMMergePolicy but not get it?

Oh, we should add a getter (getRAMMergePolicy, not getLogMergePolicy)
for it, but it should return MergePolicy not LogMergePolicy.

> The thread is simply launched w/o checking
> maxThreadCount, if the merge is in RAM.

Hmm... We can't just create threads and let them be garbage
collected as JVMs tend to throw OOMs with this. If we go down
this route of a single CMS, maybe we can borrow some code from
an Apache project that's implemented a threadpool.

This is how CMS has always been. It launches threads relatively
rarely – this shouldn't lead to OOMs. One can always subclass CMS if
this is somehow a problem. Or we could modify CMS to pool its threads
(as a new issue)?


Jason Rutherglen added a comment - 06/May/09 11:38 PM
In the patch the merge policies are split up which requires some
of the RAM NRT logic to be in updatePendingMerges.

One solution is to have a merge policy that manages merging to
ram and to disk, kind of an overarching merge policy for the
primary MP and the RAM MP. This would push the logic of ram
merging and primary dir merging to the meta merge policy which
would clean up IW from managing ram segs vs. prim segs.

Does IW.optimize and IW.expungeDeletes operate on the ramdir as
well (the expungeDeletes javadoc implies calling
IR.numDeletedDocs will return zero when there are no deletes).


Jason Rutherglen added a comment - 08/May/09 05:36 AM
Something in the DocumentsWriter API we may need to change is to
allow passing a directory through the IndexingChain. In the RAM
NRT case, which directory we write to can change depending on if
a ram buffer has exceeded it's maximum available size. If it is
under half the available ram it will to go the ram dir, if not
the new segment will be written to disk. For this reason we
can't simply pass a directory into the constructor of
DocumentsWriter, nor can we rely on calling
IW.getFlushDirectory. We should be able to rely on the directory
in SegmentWriteState?

Michael McCandless added a comment - 08/May/09 09:06 AM

Does IW.optimize and IW.expungeDeletes operate on the ramdir as
well (the expungeDeletes javadoc implies calling
IR.numDeletedDocs will return zero when there are no deletes).

I think IW.optimize should mean all RAM segments are merged into the single on-disk segment?

And IW.expungeDeletes should also apply to RAM segments, ie if RAM segments have pending deletes, they are merged away (possibly entirely in RAM, ie the RAM merge policy could simply merge to a new RAM segment)?

In the patch the merge policies are split up which requires some
of the RAM NRT logic to be in updatePendingMerges.
One solution is to have a merge policy that manages merging to
ram and to disk,

It looks like it's the "is it time to flush to disk" logic, right? Why can't we make that the purview of the RAM MergePolicy? We may need to extend MergePolicy API to tell it how much RAM is free in the budget.

We should be able to rely on the directory in SegmentWriteState?

I think we should fix the indexing chain to always use SegmentWriteState's Directory and not pass Directory to the ctors? Does something go wrong if we take that approach?


Jason Rutherglen added a comment - 11/May/09 04:21 PM

IW.optimize should mean all RAM segments are merged into
the single on-disk segment

Yes.

IW.expungeDeletes should also apply to RAM segments, ie
if RAM segments have pending deletes, they are merged away
(possibly entirely in RAM, ie the RAM merge policy could simply
merge to a new RAM segment)?

Yes.

we should fix the indexing chain to always use
SegmentWriteState's Directory and not pass Directory to the
ctors

Yep.

The next patch will have these features.


Jason Rutherglen added a comment - 12/May/09 03:20 AM
  • A single merge scheduler is used. We will need to open a new
    issue for a version of ConcurrentMergeScheduler that allocates
    threads perhaps based on the merge.directory? We'd also probably
    want to add thread pooling.
  • There's a package protected IW ctor that accepts the ram dir.
    This is used in the test case for insuring we aren't creating
    .cfs files in the ram dir.
  • IW.optimize merges all segments (ram included) to the primary
    dir
  • IW.expungeDeletes merges segments with deletes, in ram ones
    stay in ram (unless they won't fit), and primary dir ones are
    handled as usual
  • Added testOptimize, testExpungeDeletes, and some other test
    cases
  • Needs a test case to make sure we're merging to the primary
    dir when the ram dir is full or a flush won't fit in the ram dir
  • There's a mergeRamSegmentsToDir and resolveRamSegments. Two
    different methods because mergeRamSegmentsToDir operates by
    simply scheduling merges, resolveRamSegments operates in the
    foreground like resolveExternalSegments. I'm not sure if we can
    combine the two. resolveRamSegments seems to have a thread
    notification problem and so hangs at times. I'll look into this
    further unless it's obvious what the problem is.
  • When RAM NRT is on (via the IndexWriter constructor), setting
    the ram buffer size allocates half of the given number to the
    DocumentsWriter buffer and half to the ram dir. It may be best
    to dynamically change these numbers based on usage etc.
  • Added NRTMergePolicy which is used only when RAM NRT is on. It
    utilizes the regular merge policy and the ram merge policy.
  • The ram dir size is pushed to DocumentsWriter
  • RAMMergePolicy extends LogDocMergePolicy and defaults the
    useCompoundFile and useCompoundDocStore to false
  • Sorry for the whitespace stuff, I'll clean it up later, I
    wanted to post the latest to get feedback

Jason Rutherglen made changes - 12/May/09 03:20 AM
Attachment LUCENE-1313.patch [ 12407841 ]
Jason Rutherglen added a comment - 12/May/09 07:24 PM
I think the easiest way to handle the ram buf size vs. the ram
dir size is the allow each to grow on request. I have some code
I need to test that implements it. This way we're growing based
on demand and availability. The only thing we may want to add is
a way to grow and perhaps automatically flush based on the
growth requested and perhaps prioritizing requests?

Jason Rutherglen added a comment - 19/May/09 09:59 PM
  • All tests pass, added more tests
  • Added DocumentsWriter.growRamBufferBy/growRamDirMaxBy methods
    that allow dynamically requesting more ram. We start off at
    50/50, ramdir/rambuffer. Then whenever one needs more, grow* is
    called.
  • We need a RAMPolicy class that allows customizing how ram is
    allocated. Currently the ramdir and the rambuffer compete for
    space, the user will presumably want to customize this.
  • I'm not sure the flushing always occurs when it should, and
    not sure yet how to test to insure it's flushing when it should
    (other than watching a log). What happened to the adding logging
    to Lucene patch?

Jason Rutherglen made changes - 19/May/09 09:59 PM
Attachment LUCENE-1313.patch [ 12408521 ]
Jason Rutherglen made changes - 20/May/09 04:31 AM
Description Realtime search with transactional semantics.

Possible future directions:
  * Optimistic concurrency
  * Replication

Encoding each transaction into a set of bytes by writing to a RAMDirectory enables replication. It is difficult to replicate using other methods because while the document may easily be serialized, the analyzer cannot.

I think this issue can hold realtime benchmarks which include indexing and searching concurrently.
Enable near realtime search in Lucene without external
dependencies. When RAM NRT is enabled, the implementation adds a
RAMDirectory to IndexWriter. Flushes go to the ramdir unless
there is no available space. Merges are completed in the ram
dir until there is no more available ram.

IW.optimize and IW.commit flush the ramdir to the primary
directory, all other operations try to keep segments in ram
until there is no more space.
Michael McCandless added a comment - 22/May/09 01:47 PM
I think generally we are close. I have lots of little comments from
looking through the patch:
  • Can you update the CHANGES entry to something like "IndexWriter
    now uses RAM more efficiently when in near real-time mode"? (Ie
    we don't pass RAMDir to IW).
  • DW.push/getRAMDirSize, RAMTotalMax, RAMBufferAvailable, etc. need
    to be synchronized?
  • Since IW.flushDocStores always goes to the main directory, why
    does it now take a Directory arg?
  • I don't think doAfterFlush should be responsible for calling
    pushRamDirSize(); that's more of a hook for external subclasses.
  • Yes, IW.ramSizeInBytes() should include the ramDir's bytes
  • There are still places where Directory.contains should be used,
    instead of pulling both dirs and checkign each. EG, the assert in
    DW.applyDeletes, and this assert in IW:
    if (ramNrt && merge.directory == switchDirectory) {
      assert !merge.useCompoundFile;
    }

    I'd like to eliminate IW.getInternalDirectory, if possible: to
    anyone interacting with IW, there is only one Directory, and the
    switching is entirely "under the hood".

  • I realized there is in fact a benefit to using CFS in RAM: much
    better RAM efficiency for tiny segments (because RAMDir's buffer
    size is 1 KB). Though such segments would presumably be merged
    away with time, so it may not be a big deal...
  • Is IW.mergeRAMSegmentToDir only for testing?
  • Can you name things theRAMSetting instead of theRamSetting? (Ie,
    RAM is all caps).
  • For IW.resolveRAMSegments, maybe we should make a single merge
    that merges everything down? Why even bother interacting with a
    merge policy, here?
  • Can you rename flush()'s new arg "flushToRAM" to
    "allowFlushToRAM"? Ie, even when this is true, that method may
    decide RAM is full and in fact flush to the real dir.
  • Can you rename IW.ramNRT to IW.flushToRAM? (Since it's in fact
    orthogonal to NRT).
  • It's sneaky to set docWriter.flushToDir before calling
    docWriter.flush; can't we make that an arg to docWriter.flush?
    (And docWriter would never store it).
  • Why did you need to add DW.fileLength?
  • IW.SWITCH_FILE_EXTS should be private static final (not public)?
  • We lost private on a number of attrs in IW – can you restore?
    (You should insert nocommit comments when you do that, to reduce
    risk that such changes slip in).
  • Likewise for SegmentReader.coreRef.
  • Why did you need to make RAMDir.sizeInBytes volatile? Isn't it
    always updated/accessed from sync(RAMDir) context?
  • Why do we need a new class RAMMergePolicy? (There's no API
    difference over MergePolicy). Can't we simply by default
    instantiate LogByteSizeMergePolicy, and set CFS/CFX to false?
  • IW.fileSwitchDirectory should be private?
  • Have you done any perf tests with flushToRAM = true? EG should we
    enable it by default? I think if we have a good policy for
    managing RAM it could very well be higher performance. But, we
    should explore this under a different issue, so leave the default
    at "no ram dir".

On the "how to share RAM" between RAMDir & DW's RAM buffer... instead
of pre-dividing and growing over time, I think we can simplify it by
logically sharing a single "pool".

The RAMDir only alters its ram usage when 1) we flush a new segment to
it, 2) a merge completes (either writing to the real dir or to the ram
dir), or 3) deletes are applied to segments in RAM. When such a
change happens we notify DW. DW takes then adds that base into its
ram consumption to decide when it's time to flush.

For starters, and we can optimize this later, I don't think DW should
choose on its own to flush itself to the RAMDir? That should only
happen when getReader is called, and there's still plenty of RAM
free.

So what happens is... each time getReader() is called, we make a new
smallish RAM segment. Over time, these RAM segments need merging so
we merge them. (If such a merge is fairly large, probably instead of
writing to ram it should write the new segment to the real dir, since
intermediate RAM usage will be too high).

At some point, DW detects that the RAMDir size plus its own buffer is
at the limit. If DW's buffer is relatively small, it should probably
simply flush to the RAMDir then dump entire RAMDir to the real dir as
a single merge. If DW's buffer is big, as would happen if you opened
an NRT reader but never actually called getReader(), it should flush
straight to the real dir.

One challenge we face is ensuring that while we are flushing all ram
segments to disk, we don't block the getReader() turnaround. IE we
can't make getReader() do that flush synchronously. So that needs to
be a BG merge, but we must somehow temporarily disregard the size of
those segments while the merge is running. Or, perhaps we "merge RAM
segments to disk" a bit early, eg once RAM consumed is > 90% of the
total RAM buffer, or something.


Jason Rutherglen added a comment - 26/May/09 04:42 PM
I started on the pooled ram model after the last patch because
it is cleaner. Bytes are allocated up to the given limit set by
IW.setRAMBufferSizeMB. As mentioned below, we may want to add a
setting for the max ram temporarily used.

I'm reusing the DocumentsWriter.numBytesAlloc/numBytesUsed and
created a RAMPolicy that manages ramDirBytesAlloc and
ramDirBytes. Each time a merge is scheduled, the
sizeof(segments) is allocated by RAMPolicy and the segmentsAlloc
is stored in OneMerge. Once the merge completes or fails, the
ramDirBytesAlloc is adjusted by the difference between the
actual bytes used and OM.ramDirAlloc. This way we always have
the most accurate ramDir allocation in RamP, and we properly
adjust the amount of ram consumed. This works well with our
concurrent merging model where we can't predict when a merge
will complete.

One challenge we face is ensuring that while we are
flushing all ram segments to disk, we don't block the
getReader() turnaround. IE we can't make getReader() do that
flush synchronously....perhaps we "merge RAM segments to disk" a
bit early, eg once RAM consumed is > 90% of the total RAM
buffer

You're talking about the synchronization in IW.doFlushInternal
which would block getReader while writing a segment to disk? Our
default RAMPolicy should be one where we always flush the ram
buffer to the ramdir. Basically there must always be room in the
ram dir for the ram buffer. ramdir + (rambuf * 2) < maxSize. Or
do we assume that it's ok for ramUsed to temporarily exceed
ramMax by a given percent (110% which would be an option in
RAMPolicy)? while ramBuf is being flushed to ramDir?

We may want to make some assumptions about usage of getReader
(i.e. getReader is called fairly often such that the rambuffer
is usually less than half of the ram used) when flushToRam=true
so that we can get a version of this functionality out the door,
then iterate as we gather feedback from users?

I'll include the comments in the next patch.


Jason Rutherglen made changes - 28/May/09 07:53 PM
Link This issue blocks LUCENE-1667 [ LUCENE-1667 ]
Jason Rutherglen added a comment - 29/May/09 03:49 PM
I had forgotten about concurrency with the docstores (keeping an
open IndexInput and IndexOutput) when using an FSDirectory. I
wrote a test (which fails) so getting this functionality to work
could require some reworking of FSDirectory internals?
(Something on the order of auto updating IndexInput's buffers
and file length as IndexOutput is flushed?)

We need simultaneous IndexInput and IndexOutput ops to work as
docstore is streamed to FSDir for multiple segments. After a
segment is flushed to the ramdir it can be read from, including
the docstore which is still actively being written to (for the
new segments)?


Michael McCandless added a comment - 29/May/09 04:05 PM

I had forgotten about concurrency with the docstores

That's a big (and good) change; I think we should save that one for another issue, and leave this one focusing on flushing segments through a RAMDir?


Jason Rutherglen added a comment - 29/May/09 05:31 PM
I guess I'm confused about the doc stores. We keep one open on
disk for multiple segments being created in the via FSD, which
means the IndexOutput isn't closed, however for each new
SegmentReader that's opened, we're creating a new IndexInput
only after the segment is flushed (to FSD, docstore to disk).

So the concurrent docstores may work without too much changing
of FSDir internals, as the portion of the docstore file that the
SR needs to know about has been flushed to disk when SR is
opened. The SR should then be able to open the docstore file
cleanly regardless of the open IndexOutput adding more to the
file for the next rambuf segment?


Michael McCandless added a comment - 29/May/09 06:01 PM

The SR should then be able to open the docstore file
cleanly regardless of the open IndexOutput adding more to the
file for the next rambuf segment?

That's the crucial question: can we open a new IndexInput, while an IndexOutput is still writing to the file? I vaguely remember being surprised that this worked fine on Windows, but I'm not sure.

If that's fine across all OS's, then, yes we could avoid closing the docStores when flushing a new segment.

If it's not fine, then we'd need a way to make an IndexOutputInput, which is a bigger change.

We also should [separately] consider having multiple SegmentReaders that share the same docStores, share a single set of IndexInputs (cloned). Ie if the RAM MergePolicy allows many segments in RAM at once, we are still opening real file descriptors to read the doc stores, so without such sharing we could start running out of descriptors.


Jason Rutherglen added a comment - 31/May/09 08:24 PM
I want to run the Lucene unit tests in NRT mode without creating and/or
modifying all the test cases. In lieu of adding a
System.property that turns NRT on, have we settled on a
different mechanism for global settings? Perhaps the back compat
type of system can be used here? Or for now a static variable on
IW?

can we open a new IndexInput, while an IndexOutput is
still writing to the file?

I ran a test case successfully that writes to a file while
opening threads that read from flushed sections on windows.

Closing docstores for every flush would seem to cause a lot of
overhead. With NRT + FSD aren't termvector files merged on disk for
every segment?

We also should [separately] consider having multiple
SegmentReaders that share the same docStores

Doesn't FSDir open only one FD per file?


Michael McCandless added a comment - 01/Jun/09 12:52 AM

I want to run the Lucene unit tests in NRT mode without creating and/or
modifying all the test cases.

You can just temporarily set the default then run all tests?

We never reached consensus on a back compat sort of setting...

I ran a test case successfully that writes to a file while
opening threads that read from flushed sections on windows.

Excellent; let's take this up under a new issue? If this holds true across platform (and Windows is the most challenging one) then it'll make sharing much easier.

Closing docstores for every flush would seem to cause a lot of
overhead. With NRT + FSD aren't termvector files merged on disk for
every segment?

Yes it adds overhead, though, it's not on the critical path for opening a new reader since it happens in the BG. So... things like LUCENE-1526 (making deletions, norms incrementally copy-on-write) I think are higher priority.

Doesn't FSDir open only one FD per file?

No, it opens a real file every time openInput is called. I guess we could think about having it share/clone internally?


Jason Rutherglen added a comment - 01/Jun/09 04:26 AM

You can just temporarily set the default then run all
tests?

Or have separate parallel classes inherited classes that set the
static IW variable, run the tests, and perhaps have additional
NRT specific test methods.

I guess we could think about having it share/clone
internally?

Yeah we should, I got this confused with how we're cloning in
most o.a.l.i classes, but if we're calling openInput it seems
like we should have it internally clone, I guess we'll have an
issue with classes that don't IndexInput.close, however it's
better to solve these than open many unnecessary file
descriptors.

let's take this up under a new issue?

The issue would only be the unit test for now, or should it be a
part of an existing issue? Ok, it will clean up LUCENE-1313's
unit test class.


Jason Rutherglen added a comment - 01/Jun/09 07:53 PM
With flushToRAM=true, when IW.commit is called, do we still want
to not have concurrent merges execute synchronously? Or only
wait for the merges to complete that are from ramDir to
primaryDir?

Michael McCandless added a comment - 01/Jun/09 11:52 PM

The issue would only be the unit test for now, or should it be a
part of an existing issue?

I meant the issue of FSDir sharing a single IndexInput if the same file is opened more than once (you opened LUCENE-1671 for this).

I guess we'll have an issue with classes that don't IndexInput.close

This (failing to close an IndexInput that was obtained from Directory.openInput) should be rare in Lucene – we try not to do that. Failing to close clones does happen...

With flushToRAM=true, when IW.commit is called, do we still want
to not have concurrent merges execute synchronously? Or only
wait for the merges to complete that are from ramDir to
primaryDir?

I think only ramDir -> primaryDir? commit() today doens't block on BG merges.


Jason Rutherglen added a comment - 02/Jun/09 06:59 PM

For IW.resolveRAMSegments, maybe we should make a single
merge that merges everything down? Why even bother interacting
with a merge policy, here?

LogMergePolicy.findMergesForOptimize I assumed would merge
segments to a single segment, in testing resolveRAMSegments that
doesn't always seem to be the case?

When I created a mergeAllSegments to one, the
ensureContiguousMerge would throw an exception. I thought about
avoiding ensureContiguousMerge but it required reworking a bunch
of code. Why does ensureContiguousMerge exist? Is it for
assertions or is there a reason segments need to be next to each
other when merging?


Michael McCandless added a comment - 02/Jun/09 07:29 PM

LogMergePolicy.findMergesForOptimize I assumed would merge
segments to a single segment, in testing resolveRAMSegments that
doesn't always seem to be the case?

It should, but it will respect mergeFactor in the process (ie do multiple merges if necessary). I'm thinking we should just merge all the RAM segments in one go, and not consult merge policy, in resolveRAMSegments.

When I created a mergeAllSegments to one, the
ensureContiguousMerge would throw an exception.

Why does this throw an exception? Ie you passed in the in-order RAM segments?

Why does ensureContiguousMerge exist? Is it for
assertions or is there a reason segments need to be next to each
other when merging?

I think the only thing that'd break is IndexWriter now assumes continuity when it removes the old segments and puts the new one in. LUCENE-1076 explores allowing MergePolicy to select non-contiguous merges.

But: with autoCommit=false, in order to avoid merging the doc stores, the segments (even RAM segments) must be contiguous. This is a sizable performance gain when building a large index in one IndexWriter session.


Jason Rutherglen added a comment - 02/Jun/09 08:44 PM
Just so we don't forget, we need to test flushToRAM with a custom IndexDeletionPolicy which could be a bit tricky.

Jason Rutherglen added a comment - 05/Jun/09 04:44 AM
  • RAM buffer size is stored in the writer rather than set into
    DocumentsWriter. This is due to the actual ram buffer limit in
    NRT changing depending on the size of the ramdir.
  • NRTMergePolicy and IW.resolveRAMSegments merges all ram dir
    segments to primaryDir (i.e. disk) when the ramDir is over
    totalMax, or any new merges would put ramDir over totalMax.
  • In DocumentsWriter we have a set limit on the buffer size
    which is (tempMax - ramDirSize)/2. This keeps the total ram used
    under the totalMax (or IW.maxBufferSize), while also keeping our
    temporary ram usage under the tempMax amount. When DW.ramBuffer
    limit is reached, it's auto flushed to the ramDir.
  • All tests pass except TestIndexWriterRAMDir.testFSDirectory.
    Will look into this further. When flushToRAM is on by default,
    there seems to be deadlock in
    org.apache.lucene.TestMergeSchedulerExternal, however when I
    tried to see if there is any via jconsole by setting
    ANT_OPTS="-Dcom.sun.management.jmxremote" I didn't see any. I'm
    not sure if this is due to not connecting to the right process?
    Or something else.
  • Added testReadDocuments which insures we can read documents
    we've flushed to disk. This essentially tests our ability to
    simultaneously read and write documents to and from the
    docstore. It seemd to work on Windows.
  • I think there's more that can be done to more accurately
    manage the RAM however I think the way it works is a good
    starting point.

Jason Rutherglen made changes - 05/Jun/09 04:44 AM
Attachment LUCENE-1313.patch [ 12409931 ]
Mark Miller added a comment - 15/Jun/09 01:56 PM
Whats the verdict on this one Mike? Got the impression this was a likely 3.1 ...

Michael McCandless added a comment - 15/Jun/09 03:35 PM
OK let's push it to 3.1. It's very much in progress, but 1) the iterations are slow (it's a big patch), 2) it's a biggish change so I'd prefer to it shortly after a release, not shortly before, so it has plenty of time to "bake" on trunk.

Michael McCandless made changes - 15/Jun/09 03:35 PM
Fix Version/s 2.9 [ 12312682 ]
Fix Version/s 3.1 [ 12314025 ]
Jason Rutherglen added a comment - 15/Jun/09 05:45 PM
Just wanted to give an update on this, I'm running the unit
tests with flushToRAM=true, the ones that fail are (mostly)
tests that look for files when they're now in RAM (temporarily)
and the like. I'm not sure what to do with these tests, 1)
ignore them (kind of hard to not run specific methods, I think)
2) or conditionalize them to run only if flushToRAM=false.

Jason Rutherglen made changes - 15/Jun/09 05:45 PM
Summary Realtime Search Near Realtime Search
Jason Rutherglen added a comment - 15/Jun/09 08:56 PM
TestThreadedOptimize is throwing a ensureContiguousMerge
exception. I think this is highlighting the change to merging
all ram segments to a single primaryDir segment can sometimes
lead to choosing segments that are non-contiguous? I'm not sure
of the best way to handle this.

Michael McCandless added a comment - 16/Jun/09 02:25 PM

I think this is highlighting the change to merging
all ram segments to a single primaryDir segment can sometimes
lead to choosing segments that are non-contiguous?

I don't see why that results in a non-contiguous merge? Ie, at all times the RAM segments should be on the tail of SegmentInfos? So if you merge all of them, in order, that merge should be contiguous?


Michael McCandless added a comment - 16/Jun/09 02:26 PM

conditionalize them to run only if flushToRAM=false.

That seems good?


Jason Rutherglen added a comment - 18/Jun/09 12:55 AM
The patch is cleaned up. A static variable IndexWriter.GLOBALNRT
is added, which allows all the tests to be run with
flushToRAM=true. I reran the tests which hopefully still work as
intended. Tests that looked for specific file names were changed
to work with NRT. Some of the tests are skipped entirely and
need to be written specifically for flushToRAM.
  • TestIndexWriterMergePolicy,TestBackwardsCompatibility failures
    are expected
  • TestIndexWriterRAMDir.testFSDirectory fails (will be fixed)
  • TestThreadedOptimize ensureContiguousMerge fails. This one is
    a bit mysterious, perhaps the correct assertion will show where
    it's going wrong.

I need to go through and mark the tests that can be converted to
be NRT specific.


Jason Rutherglen made changes - 18/Jun/09 12:55 AM
Attachment LUCENE-1313.patch [ 12411017 ]
Jason Rutherglen added a comment - 18/Jun/09 10:52 PM
  • TestThreadedOptimize passes, LogMergePolicy now filters the
    segmentInfos based on the dir, rather than NRTMergePolicy
    passing in only ramInfos or primaryInfos. LogMergePolicy is
    careful to select contiguous segments, by passing in a subset of
    segmentInfos, the merge policy selection broke down.
  • TestIndexWriter.testAddIndexOnDiskFull,
    testAddIndexesWithCloseNoWait fails, which I don't think
    happened before. testAddIndexOnDiskFull fails when
    autoCommit=true which I'm not sure is a valid test by the time
    this patch goes in but it probably needs to be looked into.

The other previous notes are still valid.


Jason Rutherglen made changes - 18/Jun/09 10:52 PM
Attachment LUCENE-1313.patch [ 12411152 ]
Jason Rutherglen added a comment - 19/Jun/09 06:46 PM
Using a single segmentInfos in IW seems to be problematic as
we'll always potentially have different dir non-contiguous
infos. I'm seeing the error off and on in different test cases.
I will put together a patch separating the two dir infos in IW.

Jason Rutherglen added a comment - 19/Jun/09 06:56 PM
On second thought, the previous idea would require quite a bit
of work. Perhaps OneMerge can have the segmentInfos (ramDir or
primaryDir) they were selected from and the ensureContiguous can
verify that? Then we'd adjust commitMerge to remove the newly
merged segments individually.

I'll give this a try.


Jason Rutherglen added a comment - 22/Jun/09 10:16 PM
It's progressing. Randomly some tests fail such as the one noted below.
  • TestIndexWriter.testAddIndexesWithCloseNoWait fails with
    "rollback() was called or addIndexes* hit an unhandled
    exception", TestCrash.testWriterAfterCrash fails with " [junit]
    java.io.FileNotFoundException: _a.fnm [junit] at
    org.apache.lucene.store.MockRAMDirectory.openInput(MockRAMDirecto
    ry.java:252) [junit] at
    org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:67)"
  • assert in the ctor of MergeDocIDRemapper removed (not yet sure
    how to replace it)
  • OneMerge.fromInfos is added which is the set of segmentinfos
    the merge was selected from. This is for ensureContiguousMerge
    where it's failing because we have essentially two different
    sets of segmentInfos (ram and primaryDir) in the
    IW.segmentInfos. They are not related, but for convenience are
    kept together for most of IW, then are separated out in the
    merge policy. If the goal of ensureContiguousMerge is to keep
    docStoreSegments together, this will work as ramDir and
    primaryDir docStores should not need to be adjacent (I think,
    and need to verify).

Jason Rutherglen made changes - 22/Jun/09 10:16 PM
Attachment LUCENE-1313.patch [ 12411466 ]
Jason Rutherglen added a comment - 30/Jun/09 09:28 PM
Expected tests fail, the previous tests that were failing pass
on 2 runs. I didn't make any changes though so am suspicious!

Jason Rutherglen made changes - 30/Jun/09 09:28 PM
Attachment LUCENE-1313.patch [ 12412209 ]
Jason Rutherglen made changes - 14/Jul/09 06:38 PM
Summary Near Realtime Search Near Realtime Search (using a built in RAMDirectory)
Jason Rutherglen made changes - 15/Jul/09 12:50 AM
Link This issue blocks LUCENE-1738 [ LUCENE-1738 ]
Jason Rutherglen made changes - 15/Jul/09 12:50 AM
Link This issue blocks LUCENE-1738 [ LUCENE-1738 ]
Jason Rutherglen made changes - 15/Jul/09 01:21 AM
Link This issue blocks SOLR-1278 [ SOLR-1278 ]
Jason Rutherglen made changes - 28/Aug/09 05:24 PM
Link This issue relates to LUCENE-1577 [ LUCENE-1577 ]
Jason Rutherglen added a comment - 28/Aug/09 05:24 PM
Benchmark of LUCENE-1313

Jason Rutherglen added a comment - 23/Sep/09 02:03 AM
Patch is updated to trunk and compiles. Unit tests fail because
they're hardcoded for specific numbers of files etc, otherwise I
don't think there's anything glaring.

The included benchmark indexes 1 document, queries with a sort,
and repeats N times. This is way too taxing using the existing
IW.getReader when pointing to a hard drive, basically the
machine becomes unusable. I haven't figured out what to publish
or how long I want to wait. I'll just run it overnight.

This patch's benchmark fairs much better, I'll publish it's
results as well. Thinking 1, 10, 25, 50, 100, 1000 doc
increments should suffice, so there's some variation.


Jason Rutherglen made changes - 23/Sep/09 02:03 AM
Attachment LUCENE-1313.patch [ 12420342 ]