  1. Lucene - Core
  2. LUCENE-7302

IndexWriter should tell you the order of indexing operations


    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.2, 7.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New


      Today, when you index concurrently from multiple threads, Lucene
      knows the effective order in which those operations were applied to
      the index, but doesn't return that information to you.

      But this is important to know if you want to build a reliable search
      API on top of Lucene. Combined with the recently added NRT
      replication (LUCENE-5438), it can be a strong basis for an efficient
      distributed search API.

      I think we should return this information, since we already have it,
      and since it could simplify servers (ES/Solr) on top of Lucene:

      • They would not need locking to prevent the same id from being
        indexed concurrently, since they could instead check the returned
        sequence number to know which update "won", for features like
        "realtime get". (Locking is probably still needed for features
        like optimistic concurrency.)
      • When re-applying operations from a prior commit point, e.g. when
        recovering from a transaction log after a crash, they can know
        exactly which operations made it into the commit and which did
        not, and replay only the truly missing operations.
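
To illustrate the first point, here is a minimal, self-contained sketch of "highest sequence number wins" for a realtime-get cache. It does not use Lucene: `SeqNoWriter`, `RealtimeGetCache`, and their methods are hypothetical stand-ins, and the sketch assumes only that each indexing call returns a number reflecting the effective order in which the writer applied it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class SeqNoDemo {

    // Hypothetical stand-in for a writer that returns a sequence number for
    // every operation. This is NOT Lucene's IndexWriter.
    static final class SeqNoWriter {
        private final AtomicLong nextSeqNo = new AtomicLong(1);

        // Pretends to apply the update and returns its place in the
        // effective order of operations.
        long updateDocument(String id, String body) {
            return nextSeqNo.getAndIncrement();
        }
    }

    // Hypothetical "realtime get" cache: per id, keeps the body whose update
    // received the highest sequence number. No per-id lock is held around
    // the indexing call itself.
    static final class RealtimeGetCache {
        private static final class Versioned {
            final long seqNo;
            final String body;

            Versioned(long seqNo, String body) {
                this.seqNo = seqNo;
                this.body = body;
            }
        }

        private final Map<String, Versioned> latest = new ConcurrentHashMap<>();

        void record(String id, String body, long seqNo) {
            // ConcurrentHashMap.merge is atomic per key; keep the entry
            // with the higher sequence number.
            latest.merge(id, new Versioned(seqNo, body),
                    (old, candidate) -> candidate.seqNo > old.seqNo ? candidate : old);
        }

        String get(String id) {
            Versioned v = latest.get(id);
            return v == null ? null : v.body;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SeqNoWriter writer = new SeqNoWriter();
        RealtimeGetCache cache = new RealtimeGetCache();

        // Two threads update the same id without coordinating with each other.
        Thread a = new Thread(() -> cache.record("doc1", "from thread A",
                writer.updateDocument("doc1", "from thread A")));
        Thread b = new Thread(() -> cache.record("doc1", "from thread B",
                writer.updateDocument("doc1", "from thread B")));
        a.start();
        b.start();
        a.join();
        b.join();

        // Whichever update got the higher sequence number is the one served.
        System.out.println("winner: " + cache.get("doc1"));
    }
}
```

The cache stays correct even if the two `record` calls arrive in the opposite order from the sequence numbers, because the comparison is on the number rather than on arrival time.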

      Not returning this just hurts people who try to build servers on top
      with clear semantics on crashing/recovering ... I also struggled with
      this when building a simple "server wrapper" on top of Lucene.
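
The crash-recovery case from the second bullet could look roughly like the sketch below. It is an illustration only: `LogEntry` and `missingAfterCommit` are hypothetical names, nothing here calls Lucene, and it assumes that a commit reports a sequence number such that every operation with a number less than or equal to it is in the commit and every operation with a larger number is not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TranslogReplayDemo {

    // Hypothetical transaction-log entry: the operation plus the sequence
    // number the writer returned when the operation was applied.
    static final class LogEntry {
        final long seqNo;
        final String op;

        LogEntry(long seqNo, String op) {
            this.seqNo = seqNo;
            this.op = op;
        }
    }

    // Returns the entries that are not covered by the commit, i.e. those
    // whose sequence number is greater than the commit's (an assumption of
    // this sketch, see the lead-in).
    static List<LogEntry> missingAfterCommit(List<LogEntry> log, long commitSeqNo) {
        List<LogEntry> missing = new ArrayList<>();
        for (LogEntry entry : log) {
            if (entry.seqNo > commitSeqNo) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<LogEntry> log = Arrays.asList(
                new LogEntry(1, "add doc1"),
                new LogEntry(2, "update doc1"),
                new LogEntry(4, "delete doc2"),
                new LogEntry(5, "add doc3"));

        // Suppose the last successful commit reported sequence number 3.
        for (LogEntry entry : missingAfterCommit(log, 3)) {
            System.out.println("replay seqNo=" + entry.seqNo + " op=" + entry.op);
        }
    }
}
```

With sequence numbers recorded next to each log entry, recovery reduces to a filter on one number instead of replaying the whole log or guessing which operations reached the commit.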


        1. LUCENE-7032.patch
          80 kB
          Michael McCandless
        2. LUCENE-7132.patch
          11 kB
          Michael McCandless



            Assignee: mikemccand Michael McCandless
            Reporter: mikemccand Michael McCandless