Lucene - Core
LUCENE-2047

IndexWriter should immediately resolve deleted docs to docID in near-real-time mode

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.9, 5.0
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Spinoff from LUCENE-1526.

      When deleteDocuments(Term) is called, we currently always buffer the
      Term and only later, when it's time to flush deletes, resolve to
      docIDs. This is necessary because we don't in general hold
      SegmentReaders open.

      But, when IndexWriter is in NRT mode, we pool the readers, and so
      deleting in the foreground is possible.

      It's also beneficial, in that it can reduce the turnaround time when
      reopening a new NRT reader by taking this resolution off the reopen
      path. And if multiple threads are used to do the deletion, we gain
      concurrency, versus reopen, which is not concurrent when flushing the
      deletes.
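
      Below is a minimal sketch (not from any of the attached patches) of what
      resolving a delete-by-Term to docIDs against an already-pooled reader
      could look like with the pre-4.0 TermDocs API; the helper class and
      method names are hypothetical:

        import java.io.IOException;
        import java.util.ArrayList;
        import java.util.List;
        import org.apache.lucene.index.IndexReader;
        import org.apache.lucene.index.Term;
        import org.apache.lucene.index.TermDocs;

        class ForegroundDeleteSketch {
          // Resolve a delete-by-Term to the matching docIDs on one (pooled) reader.
          static List<Integer> resolveTermToDocIDs(IndexReader reader, Term term)
              throws IOException {
            List<Integer> docIDs = new ArrayList<Integer>();
            TermDocs td = reader.termDocs(term);  // terms dict lookup + postings walk
            try {
              while (td.next()) {
                docIDs.add(td.doc());
              }
            } finally {
              td.close();
            }
            return docIDs;
          }
        }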

      1. LUCENE-2047.patch
        6 kB
        Jason Rutherglen
      2. LUCENE-2047.patch
        10 kB
        Jason Rutherglen
      3. LUCENE-2047.patch
        13 kB
        Jason Rutherglen
      4. LUCENE-2047.patch
        15 kB
        Jason Rutherglen
      5. LUCENE-2047.patch
        16 kB
        Jason Rutherglen
      6. LUCENE-2047.patch
        17 kB
        Jason Rutherglen
      7. LUCENE-2047.patch
        18 kB
        Jason Rutherglen

        Activity

        Jason Rutherglen added a comment -

        Is this going to be an option or default to true only when NRT is on?

        Also, I can create a patch for this.

        Michael McCandless added a comment -

        I think simply default to true.

        Yes, please cons up a patch!

        Jason Rutherglen added a comment -

        Deletes occur immediately if poolReader is true

        I'm not sure updateDocument needs to delete immediately; since it's also writing a document, the cost of deleting later would be lost in the noise.

        Michael McCandless added a comment -

        Thanks Jason!

        I would think we do want to delete live for updateDocument as well? It's not clear the deletion will be in the noise (if it hits a disk seek, it won't).

        Michael McCandless added a comment -

        Pushing to 3.1...

        Jason Rutherglen added a comment -

        I would think we do want to delete live for
        updateDocument as well? It's not clear the deletion will be in
        the noise (if it hits a disk seek, it won't).

        The delete shouldn't hit a disk seek because, with poolReaders,
        the deletes are applied and carried over in RAM. I'd assume the
        overhead of flushing the segment would be the main consumer of
        CPU and perhaps IO, which, with LUCENE-1313, is less of an
        issue.

        For consistency with deleteDocument, I'll add live deleting to
        updateDocument.

        Michael McCandless added a comment -

        We could very likely hit seeks resolving the term -> docID. EG to consult the terms dict index (eg page fault), terms dict itself, and then to load the posting. (Once we have pulsing, we can save that seek).

        Jason Rutherglen added a comment -

        likely hit seeks resolving the term -> docID

        Right, duh! Hopefully we won't get into the delete by doc id discussion again.

        Jason Rutherglen added a comment -

        I added deleting live for updateDocument.

        TestNRTReaderWithThreads and TestIndexWriterReader pass.

        Jason Rutherglen added a comment -

        I'd like to remove the syncing in the deleteLive methods; however, this causes other tangential exceptions.

        Jason Rutherglen added a comment -

        Without syncing on the writer, we run into the SR's files being deleted out from under it while it's being obtained for deletion (I think).

        Jason Rutherglen added a comment -

        After thumbing a bit through the code (this is a bit too complex
        for me to venture a guess on), we'd need to somehow prevent the
        deletion of the SR's files we're deleting from, even if that SR
        is no longer live. That means possibly interacting with the
        deletion policy, or some other mechanism. It's a bit hairy given
        the segment info transaction shuffling that goes on in IW.

        I believe making the live deletes async is a good thing, as that
        way we're taking advantage of concurrency. The possible expense
        is in deleting from segments that are no longer live while the
        deleting is occurring.

        Jason Rutherglen added a comment -

        I think there's a fundamental design issue here. What happens to
        documents that need to be deleted but are still in the RAM
        buffer? Because we're not buffering the deletes, those wouldn't
        be applied. Or would we still buffer the deletes, then apply
        them only to the new SR created from the RAM buffer when flush
        or getReader is called?

        Michael McCandless added a comment -

        we'd need to
        prevent the deletion of the SR's files we're deleting from, even
        if that SR is no longer live.

        It's strange that anything here is needed, because, when you check a
        reader out from the pool, it's incRef'd, which should mean the files
        need no protection. Something strange is up... could it be that when
        you checkout that reader to do deletions, it wasn't already open, and
        then on trying to open it, its files were already deleted? (In which
        case, that segment has been merged away, and, the merge has committed,
        ie already carried over all deletes, and so you should instead be
        deleting against that merged segment).

        So I think the sync(IW) is in fact necessary? Note that the current
        approach (deferring resolving term -> docIDs until flush time) is also
        sync(IW)'d, so we're not really changing that, here. Though I agree
        it would be nice to not have to sync(IW). Really what we need to sync
        on is "any merge that is merging this segment away and now wants to
        commit". That's actually a very narrow event so someday (separate
        issue) if we could refine the sync'ing to that, it should be a good
        net throughput improvement for updateDocument.

        What happens to
        documents that need to be deleted but are still in the RAM
        buffer?

        Ahh, yes. We must still buffer for this case, and resolve these
        deletes against the newly flushed segment. I think we need a separate
        buffer that tracks pending delete terms only against the RAM buffer?

        Also, instead of actually setting the bits in SR's deletedDocs, I
        think you should buffer the deleted docIDs into DW's
        deletesInRAM.docIDs? Ie, we do the resolution of Term/Query -> docID,
        but buffer the docIDs we resolved to. This is necessary for
        correctness in exceptional situations, eg if you do a bunch of
        updateDocuments, then DW hits an aborting exception (meaning its RAM
        buffer may be corrupt) then DW currently discards the RAM buffer, but,
        leaves previously flushed segments intact, so that if you then commit,
        you have a consistent index. Ie, in that situation, we don't want the
        docs deleted by updateDocument calls to be committed to the index, so
        we need to buffer them.

        Jason Rutherglen added a comment -

        It's strange that anything here is needed

        I was obtaining the segment infos synced, had a small block of
        unsynced code, then synced obtaining the sometimes defunct
        readers. Fixed that part, then the errors went away!

        the sync(IW) is in fact necessary?

        I'm hoping we can do the deletes unsynced, which will make this
        patch a net performance gain because we're allowing multiple
        threads to delete concurrently (whereas today we're performing
        them synced at flush time, i.e. the current patch is merely
        shifting the term/query lookup cost from flush to
        deleteDocument).

        buffer the deleted docIDs into DW's deletesInRAM.docIDs

        I'll need to step through this, as it's a little strange to me
        how DW knows the doc id to cache for a particular SR, i.e. how
        are they mapped to an SR? Oh there's the DW.remapDeletes method?
        Hmm...

        Couldn't we save off a per SR BV for the update doc rollback
        case, merging the special updated doc BV into the SR's deletes
        on successful flush, throwing them away on failure? Memory is
        less of a concern with the paged BV from the pending LUCENE-1526
        patch. On a delete by query with many hits, I'm concerned about
        storing too many doc id Integers in BufferedDeletes.

        Without syncing, new deletes could arrive, and we'd need to
        queue them, and apply them to new segments, or newly merged
        segments because we're not locking the segments. Otherwise some
        deletes could be lost.

        A possible solution is, deleteDocument would synchronously add
        the delete query/term to a queue per SR and return.
        Asynchronously (i.e. in background threads) the deletes could be
        applied. Merging would aggregate the incoming SR's queued
        deletes (as they haven't been applied yet) into the merged
        reader's delete queue. On flush we'd wait for these queued
        deletes to be applied. After flush, the queues would be clear
        and we'd start over. And because the delete queue is per reader,
        it would be thrown away with the closed reader.

        Jason Rutherglen added a comment -

        Also, if the per SR delete queue were implemented, we could expose
        the callback, and allow users to delete by doc id, edit norms
        (and in the future, update field caches) for a particular
        IndexReader. We'd pass the reader via a callback that resembles
        IndexReaderWarmer, then deletes, norms updates, etc, could be
        performed like they can be today with a non-readonly IR.

        Michael McCandless added a comment -

        I'm hoping we can do the deletes unsynced, which will make this
        patch a net performance gain because we're allowing multiple
        threads to delete concurrently (whereas today we're performing
        them synced at flush time, i.e. the current patch is merely
        shifting the term/query lookup cost from flush to
        deleteDocument).

        Merely shifting the cost off the critical reopen path is already a
        good step forward

        But I agree, if we can also allow deletes to happen concurrently, that
        would be fabulous.

        Currently the buffered deleted docIDs in DW are only due to exceptions
        hit while adding a document. EG say for a given doc there was an
        exception during analysis, after yielding 100 tokens. At this point
        DW's RAM buffer (postings) has already been changed to hold this
        docID, and we can't easily undo that. So we instead mark the docID as
        immediately deleted. These deleted docIDs can then carry in RAM for
        some time, and get remapped when merges commit.

        On a delete by query with many hits, I'm concerned about storing too many doc id Integers in BufferedDeletes.

        I agree, that's a risk if we are buffering an Integer for each deleted
        docID, so we should avoid that.

        We really just need an efficient IntSet. Or, it's even possible a
        data structure that does no deduping (eg a simple growable int[])
        would be fine, since apps would tend not to delete the same docID over
        and over. Since we flush when deletes use too much RAM, we'd be
        protected from silly apps...

        Or maybe start with a growable int[], but cutover to BV once that's
        too large? This would protect the worst case of deleting by a single
        query matching many docs in the index.

        Whatever data structure it is, it should live under BufferedDeletes,
        so the exception logic that already discards these deletes on hitting
        an aborting exception will just continue to work.
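
        A minimal sketch of that growable int[] which cuts over to a bit set
        once it grows too large; the class name and cutover threshold here are
        made up purely for illustration:

          import java.util.BitSet;

          class DocIdBuffer {
            private static final int CUTOVER = 1 << 16; // assumption: switch after 64K buffered ids
            private final int maxDoc;
            private int[] ids = new int[16];
            private int count;
            private BitSet bits;                        // non-null once we've cut over

            DocIdBuffer(int maxDoc) { this.maxDoc = maxDoc; }

            void add(int docID) {
              if (bits != null) {
                bits.set(docID);
                return;
              }
              if (count == ids.length) {
                if (count >= CUTOVER) {                 // too many ids: move them all into a bit set
                  bits = new BitSet(maxDoc);
                  for (int i = 0; i < count; i++) {
                    bits.set(ids[i]);
                  }
                  ids = null;
                  bits.set(docID);
                  return;
                }
                int[] bigger = new int[ids.length * 2];
                System.arraycopy(ids, 0, bigger, 0, count);
                ids = bigger;
              }
              ids[count++] = docID;
            }
          }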

        Memory is less of a concern with the paged BV from the pending LUCENE-1526 patch

        I'm not sure we even want to switch to a paged BV for NRT in general
        (just added comment to that effect), though I guess for just this case
        (buffering deletes in IW in NRT mode) it could make sense?

        Couldn't we save off a per SR BV for the update doc rollback
        case, merging the special updated doc BV into the SR's deletes
        on successful flush, throwing them away on failure?

        I like that approach. Because it means the normal case (no exception
        is hit) is very fast (no merging of the two BufferedDeletes, on flush,
        like we do today; and no docID remapping required), and only on
        exception must we restore the previously saved deletes.

        Another alternative would be to redefine the semantics of IW on
        hitting an error.... right now, if you hit an aborting exception (one
        that may have corrupted DW's internal RAM postings), you only lose
        those docs buffered in DW at the time. Any already flushed segments &
        new deletions within this IW session are preserved. So in theory if
        you closed your IW, those changes would persist.

        We could consider relaxing this, such that the entire session of IW is
        rolled back, to the last commit/when-IW-was-opened, just like we now
        do with OOME.

        A possible solution is, deleteDocument would synchronously add
        the delete query/term to a queue per SR and return.
        Asynchronously (i.e. in background threads) the deletes could be
        applied.

        I'd prefer not to add further BG threads to IW, ie, take the app's
        thread to perform the deletion. If the app wants concurrency for
        deletes, it can use multiple threads.

        I believe we only have to sync on the merge committing its deletes,
        right? So we should make a separate lock for that?

        And, on commit() we must also wait for all in-flight deletes to
        finish.

        Finally, when a new segment is flushed, we should resolve the buffered
        Term/Query deletions against it, during the flush?

        Also, if the per SR delete queue were implemented, we could expose
        the callback, and allow users to delete by doc id, edit norms
        (and in the future, update field caches) for a particular
        IndexReader. We'd pass the reader via a callback that resembles
        IndexReaderWarmer, then deletes, norms updates, etc, could be
        performed like they can be today with a non-readonly IR.

        I'd like to strongly decouple this discussion of extensibility, from
        what we're doing in this issue. I think it's a good idea, but let's
        handle it separately – maybe under LUCENE-2026 (refactoring IW,
        which'd pull out the reader pool). This issue should all be "under
        the hood" improvements.

        Michael McCandless added a comment -

        Thinking more on this... I'm actually no longer convinced that this
        change is worthwhile.

        Net/net this will not improve the dps/qps throughput on a given fixed
        hardware, because this is a zero sum game: the deletes must be
        resolved one way or another.

        Whether we do it in batch (as today), or incrementally/concurrently,
        one at a time as they arrive, the same work must be done. In fact,
        batch should be less costly in practice since it clearly has temporal
        locality in resolving terms -> postings, so on a machine whose IO
        cache can't hold the entire index in RAM, bulk flushing should be
        a win.

        It's true that net latency of reopen will be reduced by being
        incremental, but Lucene shouldn't aim to be able to reopen 100s of
        times per second: I think that's a mis-feature (most apps don't need
        it), and those that really do can and should use an approach like
        Zoie.

        Finally, one can always set the max buffered delete terms/docs to
        something low, to achieve this same tradeoff. It's true that won't
        get you concurrent resolving of deleted Terms -> docIDs, but I bet in
        practice that concurrency isn't necessary (ie the performance of a
        single thread resolving all buffered deletes is plenty fast).

        If the reopen time today is plenty fast, especially if you configure
        your writer to flush often, then I don't think we need incremental
        resolving of the deletions?

        Jason Rutherglen added a comment -

        I want to replay how DW handles the updateDoc call to see if my
        understanding is correct.

        1: Analyzing hits an exception for a doc; its doc id has
        already been allocated, so we mark it for deletion later (on
        flush?) in BufferedDeletes.

        2: RAM Buffer writing hits an exception, we've had updates which
        marked deletes in current segments, however they haven't been
        applied yet because they're stored in BufferedDeletes docids.
        They're applied on successful flush.

        Are these the two scenarios correct or am I completely off
        target? If correct, isn't update doc already deleting in the
        foreground?

        prefer not to add further BG threads

        Maybe we can use 1.5's ReentrantReadWriteLock to effectively
        allow multiple del/update doc calls to concurrently acquire the
        read lock, and perform the deletes in the foreground. The write
        lock could be acquired during commitDeletes, commit(), and after
        a segment is flushed? I'm not sure it would be necessary to
        acquire this write lock anytime segment infos is changed?
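
        A minimal sketch of that ReentrantReadWriteLock arrangement (method
        names are illustrative, not IndexWriter's actual API): delete/update
        calls resolve deletes under the shared read lock, while flush and
        commit take the exclusive write lock.

          import java.util.concurrent.locks.ReentrantReadWriteLock;

          class DeleteLockSketch {
            private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

            // Many threads may resolve term/query -> docIDs concurrently.
            void deleteLive(Runnable resolveAndMarkDeletes) {
              lock.readLock().lock();
              try {
                resolveAndMarkDeletes.run();
              } finally {
                lock.readLock().unlock();
              }
            }

            // Flush/commit excludes in-flight delete resolution.
            void flushOrCommit(Runnable applyAndFlush) {
              lock.writeLock().lock();
              try {
                applyAndFlush.run();
              } finally {
                lock.writeLock().unlock();
              }
            }
          }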

        I think it's important to remove unnecessary global locks on
        unitary operations (like deletes). We've had great results
        removing these locks for isDeleted, (NIO)FSDirectory where we
        didn't think there'd be an improvement, and there was. I think
        this patch (or a follow on one that implements the shared lock
        solution) could effectively increase throughput (for deleting
        and updating), measurably.

        Lucene shouldn't aim to be able to reopen 100s of times
        per second

        Reopening after every doc could be a valid case that I suspect
        will come up again in the future. I don't think it's too hard to
        support.

        It's true that net latency of reopen will be reduced by
        being incremental, but Lucene shouldn't aim to be able to reopen
        100s of times per second:

        Perhaps update/del throughput will increase because of the
        shared lock, which would make the patch(es) worth implementing.

        but I bet in practice that concurrency isn't necessary
        (ie the performance of a single thread resolving all buffered
        deletes is plenty fast).

        We thought the same thing about the sync in FSDirectory, and it
        turned out that in practice, NIOFSDir is an order of magnitude
        faster on *nix machines. For NRT, every little bit of
        concurrency will probably increase throughput. (i.e. most users
        will have their indexes in IO cache and/or a ram dir, which
        means we wouldn't be penalizing concurrency as we are today with
        the global lock IW for del/up docs).

        I'm going to go ahead and wrap up this patch, which will shift
        deletion cost to the del/up methods (still synchronously). Then
        create a separate patch that implements the shared lock
        solution.

        Exposing SRs for updates by the user can be done today, I'll
        open a patch for this.

        Michael McCandless added a comment -

        Reopening after every doc could be a valid case that I suspect will come up again in the future.

        I suspect the vast majority of apps would be fine with 10 reopens per
        second.

        Those that really must reopen 100s of times per second can use Zoie,
        or an approach like it.

        I don't think it's too hard to support.

        Whoa! Merely thinking about and discussing even how to run proper
        tests for NRT, let alone the possible improvements to Lucene on the
        table, is sucking up all my time

        I think it's important to remove unnecessary global locks on
        unitary operations (like deletes).

        Yeah, I agree we should in general always improve our concurrency. In
        this case, resolving deletes syncs the entire IW + DW, so that blocks
        indexing new docs, launching/committing merges, flushing, etc. which
        we should fix. I just don't think NRT is really a driver for this...

        1: Analyzing hits an exception for a doc; its doc id has
        already been allocated, so we mark it for deletion later (on
        flush?) in BufferedDeletes.

        Analyzing or any other "non-aborting" exception, right.

        2: RAM Buffer writing hits an exception, we've had updates which
        marked deletes in current segments, however they haven't been
        applied yet because they're stored in BufferedDeletes docids.
        They're applied on successful flush.

        No – the deletes are buffered as Term, Query or docids, in the
        BufferedDeletes. The only case that buffers docids now is your #1
        above. On successful flush, these buffered things are moved to the
        deletesFlush (but not resolved). They are only resolved when we
        decide it's time to apply them (just before a new merge starts, or,
        when we've triggered the time-to-flush-deletes limits).
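
        A minimal sketch of that two-stage buffering (deletesInRAM moved to
        deletesFlushed on a successful flush, and only resolved when it's time
        to apply them); the field and method names are illustrative, not
        DocumentsWriter's actual code:

          import java.util.ArrayList;
          import java.util.List;
          import org.apache.lucene.index.Term;

          class BufferedDeletesSketch {
            final List<Term> deletesInRAM = new ArrayList<Term>();   // pending against the current RAM buffer
            final List<Term> deletesFlushed = new ArrayList<Term>(); // carried until it's time to apply

            void bufferDelete(Term t) { deletesInRAM.add(t); }

            void onSuccessfulFlush() {        // moved, but not yet resolved to docIDs
              deletesFlushed.addAll(deletesInRAM);
              deletesInRAM.clear();
            }

            void onAbortingException() {      // RAM-buffer deletes are discarded
              deletesInRAM.clear();
            }

            List<Term> takeDeletesToApply() { // resolved just before a merge or at the flush-deletes limits
              List<Term> toApply = new ArrayList<Term>(deletesFlushed);
              deletesFlushed.clear();
              return toApply;
            }
          }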

        Maybe we can use 1.5's ReentrantReadWriteLock to effectively
        allow multiple del/update doc calls to concurrently acquire the
        read lock, and perform the deletes in the foreground.

        I think that should work well?

        Jason Rutherglen added a comment -

        resolving deletes syncs the entire IW + DW, so that
        blocks indexing new docs, launching/committing merges, flushing,
        etc... I just don't think NRT is really a driver for
        this...

        Right, I think it's a general improvement that's come to light
        during NRT development and testing. I think NRT is great in this
        regard because it stresses Lucene in a completely new way, which
        improves it for the general batch use case (i.e. users can
        simply start utilizing NRT features when they need to without
        worry).

        1: Analyzing hits an exception for a doc; its doc id
        has already been allocated, so we mark it for deletion later (on
        flush?) in BufferedDeletes.

        So there's only one use case right now, which is only when
        analyzing an individual doc fails. The update doc adds the term
        to the BufferedDeletes for later application. Makes sense. I
        think we can resolve the update doc term in the foreground. I'm
        wondering if we need a different doc id queue for these? I get
        the hunch yes, because the other doc ids need to be applied even
        on IO exception, whereas update doc id will not be applied?

        2: RAM Buffer writing hits an exception, we've had
        updates which marked deletes in current segments, however they
        haven't been applied yet because they're stored in
        BufferedDeletes docids. They're applied on successful flush.

        In essence we need to implement number 2?

        Analyzing or any other "non-aborting" exception,
        right.

        What is an example of a non-aborting exception?

        use 1.5's ReentrantReadWriteLock

        I'll incorporate RRWL into the follow on concurrent updating
        patch.

        Whoa! Merely thinking about and discussing even how to
        run proper tests for NRT, let alone the possible improvements to
        Lucene on the table, is sucking up all my time

        Yow, I didn't know. Thanks!

        Jason Rutherglen added a comment -

        I'm browsing through the applyDeletes call path, I'm tempted to
        rework how we're doing this. For my own thinking I'd still like
        to have a queue of deletes per SR and for the ram doc buffer. I
        think this gives future flexibility and makes it really clear
        when debugging what's happening underneath. I find the remapping
        doc ids to be confusing, meaning stepping through the code would
        seem to be difficult. If we're storing doc ids alongside the SR
        the docs correspond to, there's a lot less to worry about and
        just seems clearer. This may make integrating LUCENE-1313
        directly into IW more feasible as then we're working directly at
        the SR level, and can tie the synchronization process together.
        Also this could make exposing SRs externally easier and aid in
        making IW more modular in the future?

        I can't find the code that handles aborts.

        Michael McCandless added a comment -

        1: Analyzing hits an exception for a doc; its doc id has already been allocated, so we mark it for deletion later (on flush?) in BufferedDeletes.

        So there's only one use case right now, which is only when
        analyzing an individual doc fails. The update doc adds the term
        to the BufferedDeletes for later application.

        No, it adds the docID for later application. This is the one case
        (entirely internal to IW) where we delete by docID in the writer.

        I think we can resolve the update doc term in the foreground. I'm
        wondering if we need a different doc id queue for these? I get
        the hunch yes, because the other doc ids need to be applied even
        on IO exception, whereas update doc id will not be applied?

        I think we can use the same queue – whether they are applied or not
        follows exactly the same logic (ie, successful flush moves all
        deletesInRAM over to deletesFlushed), ie, an aborting exception
        clears the deletesInRAM.

        2: RAM Buffer writing hits an exception, we've had
        updates which marked deletes in current segments, however they
        haven't been applied yet because they're stored in
        BufferedDeletes docids. They're applied on successful flush.

        In essence we need to implement number 2?

        I'm confused – #2 is already what's implemented in IW, today.

        The changes on the table now are to:

        • Immediately resolve deleted (updated) terms/queries -> docIDs
        • Change how we enqueue docIDs to be per-SR, instead.
        • But: you still must buffer Term/Query for the current RAM buffer,
          and on flushing it to a real segment, resolve them to docIDs.

        Otherwise we don't need to change what's done today (ie, keep the
        deletesInRAM vs deletesFlushed)?

        What is an example of a non-aborting exception?

        Anything hit during analysis (ie, TokenStream.incrementToken()), or
        anywhere except DocInverterPerField in the indexing chain (eg if we
        hit something when writing the stored fields, I don't think it'll
        abort).

        I'm browsing through the applyDeletes call path, I'm tempted to
        rework how we're doing this.

        I think this would be a good improvement – it would mean we don't
        need to ever remapDeletes, right?

        The thing is... this has to work in non-NRT mode, too. Somehow, you
        have to buffer such a deleted docID against the next segment to be
        flushed (ie the current RAM buffer). And once it's flushed, we move
        the docID from deletesInRAM (stored per-SR) to the SR's actual deleted
        docs BV.

        We would still keep the deletes partitioned, into those done during
        the current RAM segment vs those successfully flushed, right?

        I can't find the code that handles aborts.

        It's DW's abort() method, and eg in DocInverterPerField we call
        DW.setAborting on exception to note that this exception is an aborting
        one.

        Jason Rutherglen added a comment -
        • There's pending deletes (aka updateDoc generated deletes) per
          SR. They're stored in a pending deletes BV in SR.
        • commitMergedDeletes maps the pending deletes into the
          mergeReader.
        • DW.abort clears the pending deletes from all SRs.
        • On successful flush, the SR pending deletes are merged into the
          primary del docs BV.
        • Deletes are still buffered, however they're only applied to the
          newly flushed segment (rather than all readers). If the applying
          fails, I think we need to keep some of the rollback from the
          original applyDeletes?
        • The foreground deleting seems to break a couple of tests in
          TestIndexWriter.

        Mike, you mentioned testing getReader missing deletes etc. (in
        response to potential file handle leakage); which test or
        benchmark did you use for this?

        Jason Rutherglen added a comment -

        In the updateDocument and deleteDocument methods, deletes are
        buffered per segment reader synchronized on writer. Immediately
        after, outside the sync block, they're deleted from the existing
        SRs. If a new SR is added, it's because of a flush (which has
        its own buffered deletes), or from a merge which is
        synchronized on writer. In other words, we won't lose deletes,
        they'll always be applied on a flush, and the resolution of
        deletes to doc ids happens un-synchronized on writer.

        Update document adds the term to the queued deletes, then
        resolves and adds the doc ids to an Integer list (for now). This
        is where we may want to use a growable int[] or int set.

        Flush applies queued update doc deleted doc ids to the SRs.

        commitMerge merges queued deletes from the incoming SRs. Doc ids
        are mapped to the merged reader.
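
        A minimal sketch of that docID mapping at commitMerge time, assuming
        the merge's view of the old segment's deletions is available as a bit
        set; the class name and signature are illustrative only:

          import java.util.BitSet;

          class MergeDocIdMapSketch {
            /**
             * @param oldDocID      docID within the pre-merge segment
             * @param oldDeletes    docs of the pre-merge segment skipped by the merge
             * @param mergedDocBase first docID the merged segment assigns to this old segment
             * @return docID within the merged segment, or -1 if the doc was dropped by the merge
             */
            static int remap(int oldDocID, BitSet oldDeletes, int mergedDocBase) {
              if (oldDeletes.get(oldDocID)) {
                return -1;  // already deleted before the merge: nothing to carry over
              }
              int deletedBefore = oldDeletes.get(0, oldDocID).cardinality();
              return mergedDocBase + (oldDocID - deletedBefore);
            }
          }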

        Jason Rutherglen added a comment -

        When docWriter aborts on the RAM buffer, we clear out the queued updateDoc deletes.

        Jason Rutherglen added a comment -

        DocWriter abort created a deadlock noticed in TestIndexWriter.testIOExceptionDuringAbortWithThreads. This is fixed by clearing via the reader pool. Other tests fail in TIW.

        Jason Rutherglen added a comment -

        TestIndexWriter passes, mostly due to removing assertions in
        reader pool release which presumed deletions would be
        synchronized with closing a reader. They're not anymore.
        Deletions can come in at almost any time, so a reader may be
        closed by the pool while still carrying deletes. The releasing
        thread may not be synchronized on writer because we're allowing
        deletions to occur un-synchronized.

        I suppose we need more tests to ensure the assertions are in
        fact not needed.

        Steve Rowe added a comment -

        Bulk move 4.4 issues to 4.5 and 5.0

        Uwe Schindler added a comment -

        Move issue to Lucene 4.9.


          People

          • Assignee: Michael McCandless
          • Reporter: Michael McCandless
          • Votes: 0
          • Watchers: 2
