Lucene - Core
  1. Lucene - Core
  2. LUCENE-938

I/O exceptions can cause loss of buffered deletes

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      Some I/O exceptions that result in segmentInfos rollback operations can cause buffered deletes that existed before the rollback creation point to be incorrectly lost when the IOException triggers a rollback.

      1. LUCENE-938.patch.txt
        10 kB
        Steven Parkes
      2. LUCENE-938.take2.patch
        17 kB
        Michael McCandless
      3. LUCENE-938.txt
        18 kB
        Steven Parkes
      4. LUCENE-938.txt
        18 kB
        Steven Parkes

        Activity

        Hide
        Steven Parkes added a comment -

        Patch that fixes the two relevant rollback mechanisms in IndexWriter: the rollback support in mergeSegments around maybeApplyDeletes and the rollback support in the *Transaction methods. For the later, I had to promote the transaction routines from private to package protected so that I could write the tests, which are also in the patch.

        Show
        Steven Parkes added a comment - Patch that fixes the two relevant rollback mechanisms in IndexWriter: the rollback support in mergeSegments around maybeApplyDeletes and the rollback support in the *Transaction methods. For the later, I had to promote the transaction routines from private to package protected so that I could write the tests, which are also in the patch.
        Hide
        Michael McCandless added a comment -

        Wow, good catch and nice unit test!

        Show
        Michael McCandless added a comment - Wow, good catch and nice unit test!
        Hide
        Steven Parkes added a comment -

        Only it's broke. Mixing a couple of things, I missed a couple of tests that I broke. That's what I get for coding with a cold. Another fix soon.

        And I was going to mention that I added support in the mock directory stuff for injecting deterministic I/O errors. Should help us capture tricky corner cases where we need to.

        Show
        Steven Parkes added a comment - Only it's broke. Mixing a couple of things, I missed a couple of tests that I broke. That's what I get for coding with a cold. Another fix soon. And I was going to mention that I added support in the mock directory stuff for injecting deterministic I/O errors. Should help us capture tricky corner cases where we need to.
        Hide
        Steven Parkes added a comment -

        This version has the missing fixes that got tossed when I tried to clean up that patch.

        Show
        Steven Parkes added a comment - This version has the missing fixes that got tossed when I tried to clean up that patch.
        Hide
        Michael McCandless added a comment -

        Steve, I re-worked this patch now that LUCENE-843 is committed (it
        conflicted) and am attaching it (LUCENE-938.take2.patch). It's the
        same logic as before. If it looks good to you I can commit it!

        Show
        Michael McCandless added a comment - Steve, I re-worked this patch now that LUCENE-843 is committed (it conflicted) and am attaching it ( LUCENE-938 .take2.patch). It's the same logic as before. If it looks good to you I can commit it!
        Hide
        Steven Parkes added a comment -

        Thanks. I figured there'd be conflicts.

        I won't be able to look at this until next week ...

        Show
        Steven Parkes added a comment - Thanks. I figured there'd be conflicts. I won't be able to look at this until next week ...
        Hide
        Ning Li added a comment -

        Good catch, Steven!

        One thing though: I thought we had assumed that there wouldn't be any buffered docs or delete terms when startTransaction(), so no local copies are necessary. That means no change to startTransaction() and rollbackTransaction(). If there could be buffered docs and delete terms when startTransaction(), then local copies should be made for buffered docs and localNumBufferedDeleteTerms should clone numBufferedDeleteTerms instead of just copying the reference.

        Show
        Ning Li added a comment - Good catch, Steven! One thing though: I thought we had assumed that there wouldn't be any buffered docs or delete terms when startTransaction(), so no local copies are necessary. That means no change to startTransaction() and rollbackTransaction(). If there could be buffered docs and delete terms when startTransaction(), then local copies should be made for buffered docs and localNumBufferedDeleteTerms should clone numBufferedDeleteTerms instead of just copying the reference.
        Hide
        Steven Parkes added a comment -

        Easy first: there's a comment in the code about cloning the buf delete term hash with a clear vs. copying the reference and creating a new hash. I've gone back and forth. I don't have a strong opinion.

        I did notice that every call to startTransaction had flushed the deletes ahead but I didn't see that it was required to be so. I can go either way on this, too. I'd vote for making startTransaction safer. I think, in theory, it could be opened to users at some point, if it were to help people trying to use Lucene in certain transactional contexts.

        Of course, that violates YAGNI.

        But I hadn't looked at the ram segments. Does look like if there are any at the start of a trans. and they get flushed, they would/might get lost? Since that touches on the index file deleter, it strikes me as more complex.

        Show
        Steven Parkes added a comment - Easy first: there's a comment in the code about cloning the buf delete term hash with a clear vs. copying the reference and creating a new hash. I've gone back and forth. I don't have a strong opinion. I did notice that every call to startTransaction had flushed the deletes ahead but I didn't see that it was required to be so. I can go either way on this, too. I'd vote for making startTransaction safer. I think, in theory, it could be opened to users at some point, if it were to help people trying to use Lucene in certain transactional contexts. Of course, that violates YAGNI. But I hadn't looked at the ram segments. Does look like if there are any at the start of a trans. and they get flushed, they would/might get lost? Since that touches on the index file deleter, it strikes me as more complex.
        Hide
        Ning Li added a comment -

        I didn't make myself clear. Let me try again. The patch includes two parts of changes to IndexWriter: one adds localNumBufferedDeleteTerms and localBufferedDeleteTerms and uses them in startTransaction() and rollbackTransaction(); the other fixes loss of buffered deletes in flush() (and applyDeletes() which is used by flush()).

        The second part is good and that's where you had the comment on cloning.

        I was referring to the first part. In startTransaction(), "localBufferedDeleteTerms = bufferedDeleteTerms" reference-copies bufferedDeleteTerms. Then more delete terms are buffered into bufferedDeleteTerms... so localBufferedDeleteTerms would have the delete terms buffered between startTransaction() and the first flush()...

        Show
        Ning Li added a comment - I didn't make myself clear. Let me try again. The patch includes two parts of changes to IndexWriter: one adds localNumBufferedDeleteTerms and localBufferedDeleteTerms and uses them in startTransaction() and rollbackTransaction(); the other fixes loss of buffered deletes in flush() (and applyDeletes() which is used by flush()). The second part is good and that's where you had the comment on cloning. I was referring to the first part. In startTransaction(), "localBufferedDeleteTerms = bufferedDeleteTerms" reference-copies bufferedDeleteTerms. Then more delete terms are buffered into bufferedDeleteTerms... so localBufferedDeleteTerms would have the delete terms buffered between startTransaction() and the first flush()...
        Hide
        Steven Parkes added a comment -

        Okay. Got it.

        But your earlier note got me thinking. Mike, as far as I can tell, the 843 buffered docs stuff isn't getting restored around a transaction? Or am I missing something? Are you assuming flush is called before startTransaction?

        Show
        Steven Parkes added a comment - Okay. Got it. But your earlier note got me thinking. Mike, as far as I can tell, the 843 buffered docs stuff isn't getting restored around a transaction? Or am I missing something? Are you assuming flush is called before startTransaction?
        Hide
        Michael McCandless added a comment -

        Ahh, right, we are not protecting buffered doc state inside the local
        transaction. And actually doing so would not be very easy.

        I would say we require that there are no buffered docs nor deletes
        when startTransaction() is called. The code guarantees that now but
        how about we doc this limitation and put an assert in there to make
        sure?

        These methods were added only for protecting the index during the
        addIndexes* calls; if in the future we somehow want to make them more
        powerful I think we can fix it then...

        Show
        Michael McCandless added a comment - Ahh, right, we are not protecting buffered doc state inside the local transaction. And actually doing so would not be very easy. I would say we require that there are no buffered docs nor deletes when startTransaction() is called. The code guarantees that now but how about we doc this limitation and put an assert in there to make sure? These methods were added only for protecting the index during the addIndexes* calls; if in the future we somehow want to make them more powerful I think we can fix it then...
        Hide
        Steven Parkes added a comment -

        Works for me. I'll submit a new patch.

        Show
        Steven Parkes added a comment - Works for me. I'll submit a new patch.
        Hide
        Steven Parkes added a comment -

        New patch. Removes support for buf deletes around transactions but documents and asserts this.

        Fixes a few typos in the comments.

        Show
        Steven Parkes added a comment - New patch. Removes support for buf deletes around transactions but documents and asserts this. Fixes a few typos in the comments.
        Hide
        Michael McCandless added a comment -

        This patch looks great! Thanks Steve. I'll commit it shortly.

        Show
        Michael McCandless added a comment - This patch looks great! Thanks Steve. I'll commit it shortly.

          People

          • Assignee:
            Steven Parkes
            Reporter:
            Steven Parkes
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development