Description
Today, when you index concurrently from multiple threads, Lucene
knows the effective order in which those operations were applied to
the index, but doesn't return that information to you.
But this is important to know if you want to build a reliable search
API on top of Lucene. Combined with the recently added NRT
replication (LUCENE-5438), it can be a strong basis for an efficient
distributed search API.
I think we should return this information, since we already have it,
and since it could simplify servers (ES/Solr) on top of Lucene:
- They would not need locking to prevent the same id from being
indexed concurrently, since they could instead check the returned
sequence number to know which update "won", for features like
"realtime get". (Locking is probably still needed for features
like optimistic concurrency.)
- When re-applying operations from a prior commit point, e.g. when
recovering from a transaction log after a crash, they can know
exactly which operations made it into the commit and which did
not, and replay only the truly missing operations.
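
To make the two cases above concrete, here is a minimal, self-contained sketch. It does not use Lucene at all: FakeWriter is a hypothetical stand-in that only hands out sequence numbers, and Op, LiveVersions and missingAfterCommit are names invented for illustration. It also assumes that commit would return a sequence number such that every operation at or below it is in the commit; the real semantics would need to be defined by the eventual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class SeqNoSketch {

    /** One logged operation, tagged with the sequence number the writer returned. */
    static final class Op {
        final long seqNo;
        final String id;
        final String body;

        Op(long seqNo, String id, String body) {
            this.seqNo = seqNo;
            this.id = id;
            this.body = body;
        }
    }

    /** Hypothetical stand-in for a writer; it only assigns the order of operations. */
    static final class FakeWriter {
        private final AtomicLong lastSeqNo = new AtomicLong();

        /** Returns the position of this update in the effective order. */
        long updateDocument(String id, String body) {
            return lastSeqNo.incrementAndGet();
        }

        /** Assumption: every operation with seqNo <= the returned value is in the commit. */
        long commit() {
            return lastSeqNo.get();
        }
    }

    /** "Realtime get" map with no server-side per-id lock: the highest sequence number wins. */
    static final class LiveVersions {
        private final Map<String, Op> latest = new ConcurrentHashMap<>();

        void record(Op op) {
            // merge is atomic per key, so racing threads cannot overwrite a newer update.
            latest.merge(op.id, op, (old, neu) -> neu.seqNo > old.seqNo ? neu : old);
        }

        String get(String id) {
            Op op = latest.get(id);
            return op == null ? null : op.body;
        }
    }

    /** Operations from the transaction log that did not make it into the commit. */
    static List<Op> missingAfterCommit(List<Op> log, long commitSeqNo) {
        List<Op> missing = new ArrayList<>();
        for (Op op : log) {
            if (op.seqNo > commitSeqNo) {
                missing.add(op);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        FakeWriter writer = new FakeWriter();
        LiveVersions live = new LiveVersions();
        List<Op> log = new ArrayList<>();

        Op first = new Op(writer.updateDocument("doc1", "v1"), "doc1", "v1");
        Op second = new Op(writer.updateDocument("doc1", "v2"), "doc1", "v2");
        log.add(first);
        log.add(second);

        // Record out of order, as two racing threads might; the higher seqNo still wins.
        live.record(second);
        live.record(first);
        System.out.println(live.get("doc1"));

        long commitSeqNo = writer.commit();
        log.add(new Op(writer.updateDocument("doc2", "v1"), "doc2", "v1"));

        // Only the post-commit operation needs replaying after a crash.
        System.out.println(missingAfterCommit(log, commitSeqNo).size());
    }
}
```

The point of the sketch is that the server never has to serialize updates to the same id itself: it applies them in any order and lets the returned sequence number decide the winner, and on recovery it filters the transaction log by a single number instead of guessing which operations were durable.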
Not returning this just hurts people who try to build servers on top
of Lucene with clear semantics on crashing/recovering. I also
struggled with this when building a simple "server wrapper" on top of
Lucene (LUCENE-5376).