Lucene - Core
LUCENE-1335

Correctly handle concurrent calls to addIndexes, optimize, commit

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.3, 2.3.1
    • Fix Version/s: 2.4
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New
    1. LUCENE-1335.patch
      50 kB
      Michael McCandless
    2. LUCENE-1335.patch
      47 kB
      Michael McCandless
    3. LUCENE-1335.patch
      44 kB
      Michael McCandless
    4. LUCENE-1335.patch
      42 kB
      Michael McCandless
    5. LUCENE-1335.patch
      41 kB
      Michael McCandless

      Activity

      mikemccand Michael McCandless added a comment -

      Attached patch. Here're the changes:

      • Only one addIndexes can run at once; a 2nd (or later) call to
        addIndexes simply blocks until the running one is done.
      • close() and rollback() wait for any running addIndexes to finish
        and then block new addIndexes calls
      • commit() waits for any running addIndexes, or any already running
        commit, to finish, then quickly takes a snapshot of the segments
        and syncs the files referenced by that snapshot. While syncing is
        happening, addIndexes calls are allowed to run again.
      • optimize() is allowed to run concurrently with addIndexes; the two
        simply wait for their respective merges to finish.
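
      The locking discipline in the bullets above can be sketched with a
      standard ReentrantReadWriteLock. This is a hypothetical simplification
      (the actual patch uses IndexWriter's own lock bookkeeping, and all
      names here are illustrative): addIndexes holds the write lock for its
      whole run, commit takes it only long enough to snapshot, and close
      takes it once to flip a closed flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of the concurrency rules described above. */
public class WriterSketch {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private final List<String> segments = new ArrayList<>();
    private boolean closed = false;

    /** Only one addIndexes runs at a time: it holds the write lock. */
    public void addIndexes(String seg) {
        rw.writeLock().lock();
        try {
            if (closed) throw new IllegalStateException("writer is closed");
            segments.add(seg);            // stand-in for copying/merging segments
        } finally {
            rw.writeLock().unlock();
        }
    }

    /** commit() waits for any running addIndexes, snapshots quickly, then
     *  "syncs" outside the lock so addIndexes may run again meanwhile. */
    public List<String> commit() {
        List<String> snapshot;
        rw.writeLock().lock();            // blocks until addIndexes finishes
        try {
            snapshot = new ArrayList<>(segments);   // quick snapshot
        } finally {
            rw.writeLock().unlock();      // addIndexes allowed again
        }
        // syncing of the snapshot's files would happen here, concurrently
        return snapshot;
    }

    /** close() waits for a running addIndexes, then blocks new calls. */
    public void close() {
        rw.writeLock().lock();
        try {
            closed = true;
        } finally {
            rw.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        WriterSketch w = new WriterSketch();
        w.addIndexes("_0");
        System.out.println(w.commit());
    }
}
```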

      I think we should not make any API promises about what it means when
      these methods (commit, close, rollback, optimize, addIndexes) are
      called concurrently from different threads, except that the methods
      all work correctly, IndexWriter won't throw an errant exception, and
      your index won't become corrupt.

      I made one additional change which is technically a break in backwards
      compatibility, but I think a minor acceptable one: I no longer
      allow the same Directory to be passed into addIndexes* more than once.
      This was necessary because we identify a SegmentInfo by its
      Directory/name pair, and passing in the same Directory allowed dup
      SegmentInfo instances to enter SegmentInfos, which is dangerous.

      I'll wait a few days before committing.

      yseeley@gmail.com Yonik Seeley added a comment -

      I think we should not make any API promises about what it means when
      these methods (commit, close, rollback, optimize, addIndexes) are
      called concurrently from different threads, except that the methods
      all work correctly, IndexWriter won't throw an errant exception, and
      your index won't become corrupt.

      Agree... higher level synchronization by the user is the right way to ensure/enforce an ordering.

      yseeley@gmail.com Yonik Seeley added a comment -

      I've just started reviewing this patch.
      Since doWait() can return after 1 sec, the pattern is to use a while loop that re-checks the condition that caused it to be called.
      But in some cases it's hard to tell if the code is correct... for example, copyExternalSegments() is hard because of the other non-trivial code in the loop, where it's not clear if it's safe/correct to call that code again. Perhaps registerMerge() detects the conflict with another merge with the same segments, and that's what keeps things correct? Comments explaining why it's OK to run certain code more than once would be very welcome.
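
      The pattern Yonik refers to looks like this (a hypothetical sketch,
      not Lucene's actual doWait(): class and field names are invented).
      Because the wait can return after its timeout without any
      notifyAll(), the guarding condition must be re-checked in a while
      loop, and anything run inside the loop must be safe to repeat.

```java
/** Sketch of the timed guarded-wait loop discussed above. */
public class WaitLoop {
    private final Object lock = new Object();
    private int pendingMerges = 3;

    public void waitForMerges() {
        synchronized (lock) {
            while (pendingMerges > 0) {   // a while loop, never a bare 'if'
                try {
                    lock.wait(1000);      // may return on timeout, not notify
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    public void mergeFinished() {
        synchronized (lock) {
            pendingMerges--;
            lock.notifyAll();             // wake all waiters to re-check
        }
    }

    public int pending() {
        synchronized (lock) {
            return pendingMerges;
        }
    }
}
```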

      mikemccand Michael McCandless added a comment -

      Thanks, Yonik. I'll add a comment there, and any other places where I call doWait that look similarly confusing...

      mikemccand Michael McCandless added a comment -

      Improved comments in expungeDeletes & copyExternalSegments.

      mikemccand Michael McCandless added a comment -

      Yonik, any more feedback on this patch?

      mikemccand Michael McCandless added a comment -

      I plan to commit this in a day or two.

      ningli Ning Li added a comment -

      Hi Mike, could you update the patch? I cannot apply it. Thanks!

      mikemccand Michael McCandless added a comment -

      OK, attached refreshed patch to trunk.

      ningli Ning Li added a comment -

      I agree that we should not make any API promises about what
      it means when the methods (commit, close, rollback, optimize,
      addIndexes) are called concurrently from different threads.
      The discussion below is on their current behavior.

      > Only one addIndexes can run at once, so call to 2nd or more
      > addIndexes just blocks until the one is done.

      This is achieved by the read-write lock.

      > close() and rollback() wait for any running addIndexes to finish
      > and then blocks new addIndexes calls

      Just to clarify: close(waitForMerges=false) and rollback() make
      an ongoing addIndexes[NoOptimize](dirs) abort, but wait for
      addIndexes(readers) to finish. It'd be nice if they make any
      addIndexes* abort for a quick shutdown, but that's for later.

      > commit() waits for any running addIndexes, or any already running
      > commit, to finish, then quickly takes a snapshot of the segments
      > and syncs the files referenced by that snapshot. While syncing is
      > happening addIndexes are then allowed to run again.

      commit() and commit(long) use the read-write lock to wait for
      a running addIndexes. "committing" is used to serialize commit()
      calls. Why isn't it also used to serialize commit(long) calls?

      > optimize() is allowed to run concurrently with addIndexes; the two
      > simply wait for their respective merges to finish.

      This is nice.

      More detailed comments:

      • In finishMerges, acquireRead and releaseRead are both called.
        Isn't addIndexes allowed again?
      • In copyExternalSegments, merges involving external segments
        are carried out in foreground. So why the changes? To relax
        that assumption? But other part still makes the assumption.
      • addIndexes(readers) should optimize before startTransaction, no?
      • The newly added method segString(dir) in SegmentInfos is
        not used anywhere.
      mikemccand Michael McCandless added a comment -

      Just to clarify: close(waitForMerges=false) and rollback() make
      an ongoing addIndexes[NoOptimize](dirs) abort, but wait for
      addIndexes(readers) to finish. It'd be nice if they make any
      addIndexes* abort for a quick shutdown, but that's for later.

      True, agreed.

      commit() and commit(long) use the read-write lock to wait for
      a running addIndexes. "committing" is used to serialize commit()
      calls. Why isn't it also used to serialize commit(long) calls?

      It's because commit() calls prepareCommit(), which throws a
      "prepareCommit was already called" exception if the commit was already
      prepared. Whereas commit(long) doesn't call prepareCommit (eg, it
      doesn't need to flush). Without this, I was hitting exceptions in one
      of the tests that calls commit() from multiple threads at the same
      time.
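
      The workaround Mike describes — serializing commit() with a
      "committing" flag so a second thread waits instead of tripping
      prepareCommit's "already called" exception — amounts to this wait
      loop (a hypothetical sketch, not the actual IndexWriter code):

```java
/** Sketch of serializing commit() via a "committing" flag. */
public class CommitSketch {
    private final Object commitLock = new Object();
    private boolean committing = false;
    public int commitCount = 0;

    public void commit() {
        // Wait until no other thread is mid-commit, then claim the flag.
        synchronized (commitLock) {
            while (committing) {
                try {
                    commitLock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
            committing = true;
        }
        try {
            // prepareCommit() + the rest of the commit would run here; a
            // concurrent caller never reaches it while we hold the flag
            commitCount++;
        } finally {
            synchronized (commitLock) {
                committing = false;
                commitLock.notifyAll();
            }
        }
    }
}
```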

      • In finishMerges, acquireRead and releaseRead are both called.
        Isn't addIndexes allowed again?

      This is to make sure any just-started addIndexes cleanly finish or
      abort before we enter the wait loop. I was seeing cases where the
      wait loop would think no more merges were pending, but in fact an
      addIndexes was just getting underway and was about to start merging.
      It's OK if a new addIndexes call starts up, because it'll be forced to
      check the stop conditions (closing=true or stopMerges=true) and then
      abort the merges. I'll add comments to this effect.
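
      The acquire-then-immediately-release barrier described here can be
      illustrated with a read-write lock, assuming (as a simplification)
      that addIndexes holds the write lock for its whole run. Names are
      hypothetical; the point is that acquiring the read lock blocks until
      the in-flight addIndexes finishes, and releasing it right away is
      safe because any new addIndexes will observe the stop condition.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Sketch of the barrier idiom used in finishMerges. */
public class FinishMergesSketch {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private volatile boolean stopMerges = false;

    /** addIndexes holds the write lock and checks the stop condition. */
    public boolean tryAddIndexes() {
        rw.writeLock().lock();
        try {
            if (stopMerges) return false;   // abort: shutdown in progress
            return true;                    // merging would happen here
        } finally {
            rw.writeLock().unlock();
        }
    }

    public void finishMerges() {
        stopMerges = true;
        // Barrier: acquiring the read lock blocks until any in-flight
        // addIndexes (the write-lock holder) finishes or aborts; releasing
        // immediately is fine because any new addIndexes sees stopMerges.
        rw.readLock().lock();
        rw.readLock().unlock();
        // ... now safe to enter the wait loop for pending merges ...
    }
}
```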

      • In copyExternalSegments, merges involving external segments
        are carried out in foreground. So why the changes? To relax
        that assumption? But other part still makes the assumption.

      This method has always carried out merges in the FG, but it's in fact
      possible that a BG merge thread on finishing a previous merge may pull
      a merge involving external segments. So I changed this method to wait
      for all such BG merges to complete, because it's not allowed to return
      until there are no more external segments in the index.

      It is tempting to fully schedule these external merges (ie allow them
      to run in BG), but there is a problem: if there is some error on doing
      the merge, we need that error to be thrown in the FG thread calling
      copyExternalSegments (so the transaction above unwinds). (Ie we
      can't just stuff these external merges into the merge queue and then
      wait for their completion.) So I think we need to leave it as is?

      • addIndexes(readers) should optimize before startTransaction, no?

      I had to move the optimize() inside the transaction because it could
      happen that after the optimize() is finished, some other thread sneaks
      in a call to addIndexes* and gets additional segments added to the
      index such that by the time we start the transaction we now have more
      than one segment.

      But this change will tie up more disk space than addIndexes used to
      (since it will also rollback the optimize on hitting an exception).
      Really I just need to pre-acquire the write lock, then I can leave
      optimize() out of the transaction. I'll do that.

      • The newly added method segString(dir) in SegmentInfos is
        not used anywhere.

      Yeah I was using this for internal debugging, and I think it's
      generally useful for future debugging, so I left it in.

      mikemccand Michael McCandless added a comment -

      New patch incorporating Ning's comments (thanks Ning!).

      Show
      mikemccand Michael McCandless added a comment - New patch incorporating Ning's comments (thanks Ning!).
      Hide
      ningli Ning Li added a comment -

      > It's because commit() calls prepareCommit(), which throws a
      > "prepareCommit was already called" exception if the commit was already
      > prepared. Whereas commit(long) doesn't call prepareCommit (eg, it
      > doesn't need to flush). Without this, I was hitting exceptions in one
      > of the tests that calls commit() from multiple threads at the same
      > time.

      Is it better to simplify things by serializing all commit()/commit(long) calls?

      > This is to make sure any just-started addIndexes cleanly finish or
      > abort before we enter the wait loop. I was seeing cases where the
      > wait loop would think no more merges were pending, but in fact an
      > addIndexes was just getting underway and was about to start merging.
      > It's OK if a new addIndexes call starts up, because it'll be forced to
      > check the stop conditions (closing=true or stopMerges=true) and then
      > abort the merges. I'll add comments to this effect.

      I wonder if we can simplify the logic... Currently in setMergeScheduler,
      merges can start between finishMerges and setting the new merge scheduler.
      This can be fixed by making setMergeScheduler synchronized.

      > This method has always carried out merges in the FG, but it's in fact
      > possible that a BG merge thread on finishing a previous merge may pull
      > a merge involving external segments. So I changed this method to wait
      > for all such BG merges to complete, because it's not allowed to return
      > until there are no more external segments in the index.

      Hmm... so merges involving external segments may be in FG or BG?
      So copyExternalSegments not only copies external segments, but also
      waits for BG merges involving external segments to finish. We need
      a better name?

      > It is tempting to fully schedule these external merges (ie allow them
      > to run in BG), but there is a problem: if there is some error on doing
      > the merge, we need that error to be thrown in the FG thread calling
      > copyExternalSegments (so the transaction above unwinds). (Ie we
      > can't just stuff these external merges into the merge queue and then
      > wait for their completion.)

      Then what about those BG merges involving external segments?

      mikemccand Michael McCandless added a comment -

      > It's because commit() calls prepareCommit(), which throws a
      > "prepareCommit was already called" exception if the commit was already
      > prepared. Whereas commit(long) doesn't call prepareCommit (eg, it
      > doesn't need to flush). Without this, I was hitting exceptions in one
      > of the tests that calls commit() from multiple threads at the same
      > time.

      Is it better to simplify things by serializing all commit()/commit(long) calls?

      I don't think so: with autoCommit=true, merges call commit(long)
      after finishing, and I think we want those commit calls to run
      concurrently?

      > This is to make sure any just-started addIndexes cleanly finish or
      > abort before we enter the wait loop. I was seeing cases where the
      > wait loop would think no more merges were pending, but in fact an
      > addIndexes was just getting underway and was about to start merging.
      > It's OK if a new addIndexes call starts up, because it'll be forced to
      > check the stop conditions (closing=true or stopMerges=true) and then
      > abort the merges. I'll add comments to this effect.

      I wonder if we can simplify the logic... Currently in setMergeScheduler,
      merges can start between finishMerges and set the merge scheduler.
      This one can be fixed by making setMergeScheduler synchronized.

      Good catch – I'll make setMergeScheduler synchronized.

      > This method has always carried out merges in the FG, but it's in fact
      > possible that a BG merge thread on finishing a previous merge may pull
      > a merge involving external segments. So I changed this method to wait
      > for all such BG merges to complete, because it's not allowed to return
      > until there are no more external segments in the index.

      Hmm... so merges involving external segments may be in FG or BG?
      So copyExternalSegments not only copies external segments, but also
      waits for BG merges involving external segments to finish. We need
      a better name?

      Sure we can change the name – do you have one in mind? Maybe
      "resolveExternalSegments" or "waitForExternalSegments"?

      > It is tempting to fully schedule these external merges (ie allow them
      > to run in BG), but there is a problem: if there is some error on doing
      > the merge, we need that error to be thrown in the FG thread calling
      > copyExternalSegments (so the transaction above unwinds). (Ie we
      > can't just stuff these external merges into the merge queue and then
      > wait for their completion.)

      Then what about those BG merges involving external segments?

      What'll happen is the BG merge will hit an exception, roll itself
      back, and then the FG thread will pick up the merge and try again.
      Likely it'll hit the same exception, which is then thrown back to the
      caller. It may not hit an exception, eg say it was disk full: the BG
      merge was probably trying to merge 10 segments, whereas the FG merge
      is just copying over the 1 segment. So it may complete successfully
      too.

      ningli Ning Li added a comment -

      > I don't think so: with autoCommit=true, merges call commit(long)
      > after finishing, and I think we want those commit calls to run
      > concurrently?

      After we disable autoCommit, all commit calls will be serialized, right?

      > What'll happen is the BG merge will hit an exception, roll itself
      > back, and then the FG thread will pick up the merge and try again.
      > Likely it'll hit the same exception, which is then thrown back to the
      > caller. It may not hit an exception, eg say it was disk full: the BG
      > merge was probably trying to merge 10 segments, whereas the FG merge
      > is just copying over the 1 segment. So it may complete successfully
      > too.

      Back to the issue of running an external merge in BG or FG.
      In ConcurrentMergeScheduler.merge, an external merge is run in FG,
      not in BG. But in ConcurrentMergeScheduler.MergeThread.run,
      whether a merge is external is no longer checked. Why this difference?

      mikemccand Michael McCandless added a comment -

      > I don't think so: with autoCommit=true, merges call commit(long)
      > after finishing, and I think we want those commit calls to run
      > concurrently?

      After we disable autoCommit, all commit calls will be serialized, right?

      Right.

      Back to the issue of running an external merge in BG or FG.
      In ConcurrentMergeScheduler.merge, an external merge is run in FG,
      not in BG. But in ConcurrentMergeScheduler.MergeThread.run,
      whether a merge is external is no longer checked. Why this difference?

      Good point! We no longer need to check for isExternal in CMS's merge() method – we can run all merges in the BG. In fact I think it's no longer necessary to even compute & record isExternal (this was its only use). Hmmm, except when I take this out I'm seeing testAddIndexOnDiskFull hang. I'll dig.

      Show
      mikemccand Michael McCandless added a comment - > I don't think so: with autoCommit=true, the merges calls commit(long) > after finishing, and I think we want those commit calls to run > concurrently? After we disable autoCommit, all commit calls will be serialized, right? Right. Back to the issue of running an external merge in BG or FG. In ConcurrentMergeScheduler.merge, an external merge is run in FG, not in BG. But in ConcurrentMergeScheduler.MergeThread.run, whether a merge is external is no longer checked. Why this difference? Good point! We no longer need to check for isExternal in CMS's merge() method – we can run all merges in the BG. In fact I think it's no longer necessary to even compute & record isExternal (this was its only use). Hmmm, except when I take this out I'm seeing testAddIndexOnDiskFull hang. I'll dig.
      mikemccand Michael McCandless added a comment -

      OK new patch attached with changes discussed above.

      I did fix CMS to happily perform merges involving external segments
      with its BG threads. The hang I was seeing before was because as each
      BG thread hit the disk-full exception (in the test), it would abort
      that thread, and eventually no threads were doing merges even though
      merges were still pending. So copyExternalSegments would then
      wait forever.

      The fix was simple: I changed resolveExternalSegments (renamed from
      copyExternalSegments) to pick up any pending merges that involve
      external segments and run the merge itself, only falling back to the
      "wait" call when all such merges were already in progress in CMS.
      This way the disk full error is hit in the FG and the transaction (in
      addIndexesNoOptimize) unwinds.
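The shape of that fix can be sketched like this. It is a standalone toy, not the actual IndexWriter code, and all names here are hypothetical; it only shows the "run pending external merges in the foreground, wait only as a fallback" pattern:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Sketch of the resolveExternalSegments() idea: the foreground thread picks
// up any pending merge over external segments and runs it itself, falling
// back to a timed wait only when all such merges are already in flight in
// background threads. An exception (e.g. disk full) therefore surfaces in
// the caller's thread instead of silently killing a BG thread and leaving
// the foreground waiting forever.
public class ResolveExternalDemo {
    static class Merge {
        final String segment;
        Merge(String segment) { this.segment = segment; }
    }

    private final Queue<Merge> pendingExternal = new ArrayDeque<>();
    final List<String> merged = new ArrayList<>();

    ResolveExternalDemo(String... externalSegments) {
        for (String s : externalSegments) pendingExternal.add(new Merge(s));
    }

    private synchronized Merge nextPending() { return pendingExternal.poll(); }

    private void doMerge(Merge m) {
        synchronized (this) {
            merged.add(m.segment);
            notifyAll();  // wake anyone waiting for external merges to finish
        }
    }

    private synchronized boolean anyExternalLeft(int total) {
        return merged.size() < total;
    }

    void resolveExternalSegments(int total) throws InterruptedException {
        while (anyExternalLeft(total)) {
            Merge m = nextPending();
            if (m != null) {
                doMerge(m);  // run in FG: exceptions reach the caller
            } else {
                synchronized (this) { wait(1000); }  // all merges already running
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ResolveExternalDemo d = new ResolveExternalDemo("_ext1", "_ext2", "_ext3");
        d.resolveExternalSegments(3);
        System.out.println("resolved: " + d.merged);
        // prints: resolved: [_ext1, _ext2, _ext3]
    }
}
```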

      ningli Ning Li added a comment -

Maybe this should be a separate JIRA issue. In doWait(), the comment says "as a defense against thread timing hazards where notifyAll() fails to be called, we wait for at most 1 second..." In some cases, it seems that notifyAll() simply isn't called, such as some of the cases related to runningMerges. Maybe we should take a closer look at and possibly simplify the concurrency control in IndexWriter, especially when autoCommit is disabled?
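For readers unfamiliar with the pattern being discussed, here is a minimal sketch (a hypothetical class, not IndexWriter itself) of why the defensive 1-second timeout matters: if the state changes but notifyAll() is never called, a bare wait() hangs forever, while wait(1000) re-checks the condition and recovers.

```java
// DoWaitDemo is a made-up name for this illustration. The doWait() method
// mirrors the defensive idea: never trust notifyAll() to arrive; re-check
// the wait condition at least once per second.
public class DoWaitDemo {
    private final Object lock = new Object();
    volatile boolean done = false;

    void doWait() throws InterruptedException {
        synchronized (lock) {
            while (!done) {
                lock.wait(1000);  // defensive timeout against missed notifies
            }
        }
    }

    public static void main(String[] args) throws Exception {
        DoWaitDemo d = new DoWaitDemo();
        Thread waiter = new Thread(() -> {
            try { d.doWait(); } catch (InterruptedException ignored) { }
        });
        waiter.start();
        Thread.sleep(100);
        d.done = true;        // state changes, but nobody calls notifyAll()
        waiter.join(5000);    // the timed wait lets the waiter escape anyway
        System.out.println(waiter.isAlive() ? "hung" : "recovered");
        // prints: recovered
    }
}
```

With a plain lock.wait() instead, the waiter thread would block indefinitely in this scenario, which is exactly the class of hazard the comment in IndexWriter describes.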

      mikemccand Michael McCandless added a comment -

> Maybe we should take a closer look at and possibly simplify the concurrency control in IndexWriter, especially when autoCommit is disabled?

I agree – I'm looking forward to taking the autoCommit=true case out in 3.0. I'll try to simplify the concurrency control at that point, and test for deadlocks by replacing doWait with the "real" wait(), which should expose any missing notifyAll() calls.

      mikemccand Michael McCandless added a comment -

      Ning (or anyone), any more feedback on this one? Else I plan to commit soon...

      mikemccand Michael McCandless added a comment -

Committed revision 690537.


  People

    • Assignee: mikemccand Michael McCandless
    • Reporter: mikemccand Michael McCandless
    • Votes: 0
    • Watchers: 1
