Lucene - Core / LUCENE-7700

Move throughput control and merge aborting out of IndexWriter's core?

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.x, 7.0, 6.5
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Here is a bit of background:

      • I wanted to implement a custom merging strategy that would have a custom i/o flow control (global),
      • currently, the CMS is tightly coupled with a few classes – MergeRateLimiter, OneMerge, IndexWriter.

      Looking at the code, it seems to me that everything related to I/O control could be nicely pulled out into the classes that explicitly control the merging process, namely MergePolicy and MergeScheduler. By default, one could even run without any additional I/O accounting overhead (which is currently always there, even if one doesn't use the CMS's throughput control).

      Such a refactoring would also give us a chance to move things where they belong – merge aborting into OneMerge (currently in the rate limiter), and the rate limiter's lifecycle bound to OneMerge (MergeScheduler could then use per-merge or global accounting, as it pleases).

      Just a thought and some initial refactorings for discussion.
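
      Roughly, the kind of separation in mind here might look like the sketch below. This is illustrative only – the names and signatures are not the actual patch; the point is just that wrapping/throttling decisions move out of IndexWriter and into the scheduler:

      import java.io.Closeable;
      import java.io.IOException;

      import org.apache.lucene.store.Directory;

      public abstract class MergeScheduler implements Closeable {

        /** Schedule/run the requested merges. */
        public abstract void merge(IndexWriter writer, MergeTrigger trigger, boolean newMergesFound)
            throws IOException;

        /**
         * Give the scheduler a chance to wrap the Directory a single merge writes to,
         * e.g. to apply per-merge or global rate limiting. The default is a no-op, so
         * schedulers that don't throttle add no I/O accounting overhead.
         */
        public Directory wrapForMerge(MergePolicy.OneMerge merge, Directory in) {
          return in;
        }
      }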

      1. LUCENE-7700.patch
        43 kB
        Dawid Weiss
      2. LUCENE-7700.patch
        42 kB
        Dawid Weiss
      3. LUCENE-7700.patch
        39 kB
        Dawid Weiss
      4. LUCENE-7700.patch
        16 kB
        Dawid Weiss

        Activity

        dweiss Dawid Weiss added a comment -

        A tentative first patch as a proof of feasibility. test-core passes for me without a problem.

        dweiss Dawid Weiss added a comment -

        I created a fork with an up-to-date patch here: https://github.com/dweiss/lucene-solr/tree/LUCENE-7700 Overview: https://github.com/apache/lucene-solr/compare/master...dweiss:LUCENE-7700?expand=1

        mikemccand Michael McCandless added a comment -

        Thank you Dawid Weiss for giving this some attention ... this intertwining is horribly messy today! Your patch is a nice step forward.

        One difference with your patch is we would now wrap the Directory for merge on every merge, instead of once up front, but that's fine (the cost is tiny vs. cost of the merge), and, possibly powerful, since each merge can now decide what to do about IO throttling / Directory wrapping.

        And it's nice that we can remove IW's ThreadLocal tracking the rate limiters.

        // TODO: no throughput control after changes; should we comply with merge scheduler/ policy here?

        I do think it's important that the IO throttling applies when building the CFS file? For a large merge, this is a big burst of IO at the end ... too bad we can't use an API like Linux's splice to efficiently copy bytes (though we'd likely still want throttling there too anyway...).

        dweiss Dawid Weiss added a comment - edited

        One difference with your patch is we would now wrap the Directory for merge on every merge, instead of once up front, but that's fine (the cost is tiny vs. cost of the merge)

        I admit I do have a very specific scenario at hand and you're infinitely more experienced with merging, so if this is a problem we can always change it! The "get-directory-wrapped-for-merging" part is a bit clumsy, but I didn't figure out how to do it better.

        And it's nice that we can remove IW's ThreadLocal tracking the rate limiters.

        I think so too.

        I do think this it's important that the IO throttling applies when building the CFS file? For a large merge, this is a big burst of IO in the end

        That part I didn't look too closely at, I agree. It should definitely be consistent with the rest of the throughput-control code, but there's no OneMerge instance there to work with... I'll take another look, maybe I'll come up with something.

        dweiss Dawid Weiss added a comment -

        Mike, even in current master the merge in addIndexes isn't going through mergeDirectory, but through the original directory, suppressing any bandwidth control?

              // TODO: somehow we should fix this merge so it's
              // abortable so that IW.close(false) is able to stop it
              TrackingDirectoryWrapper trackingDir = new TrackingDirectoryWrapper(directory);
        

        So it'd make sense to fix both of these, right?

        dweiss Dawid Weiss added a comment -

        Here's another iteration. It came out quite cleanly I think. In short: I moved wrapForMerge to be a method of MergeScheduler. Aborting, pausing and timings on OneMerge are now part of a dedicated class (OneMergeProgress) and are entirely abstracted away from throughput control. In fact, now only ConcurrentMergeScheduler has access to bandwidth control (although it can be fairly easily added to any other scheduler).

        IndexWriter.addIndexes(CodecReader...) doesn't respect the merge scheduler's policies (and it didn't before, so this isn't a breaking change).

        The APIs have changed in a few places (I haven't done a thorough check yet). Seems like a nice cleanup that untangles different concerns into different places.

        Everything would have to be triple-checked for correctness. I dropped synchronized blocks in a few places where a simple volatile variable was more adequate (and very likely much faster).
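
        For illustration, the pause/abort bookkeeping that now lives on the merge itself could be sketched roughly like this (simplified, not the exact class from the patch, which also tracks pause reasons and timings):

        import java.util.concurrent.locks.Condition;
        import java.util.concurrent.locks.ReentrantLock;

        public final class OneMergeProgress {
          private final ReentrantLock pauseLock = new ReentrantLock();
          private final Condition pausing = pauseLock.newCondition();

          // Single writer (abort()), many readers: a volatile flag is sufficient here.
          private volatile boolean aborted;

          public boolean isAborted() {
            return aborted;
          }

          /** Request the merge to abort and wake up any thread currently paused for throttling. */
          public void abort() {
            aborted = true;
            pauseLock.lock();
            try {
              pausing.signalAll();
            } finally {
              pauseLock.unlock();
            }
          }

          /** Pause the calling (merge) thread for up to pauseNanos, returning early if aborted. */
          public void pauseNanos(long pauseNanos) throws InterruptedException {
            pauseLock.lock();
            try {
              long remaining = pauseNanos;
              while (remaining > 0 && !aborted) {
                remaining = pausing.awaitNanos(remaining);
              }
            } finally {
              pauseLock.unlock();
            }
          }
        }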

        mikemccand Michael McCandless added a comment -

        Thanks Dawid Weiss; I'll have a look...

        dweiss Dawid Weiss added a comment -

        No rush. I've corrected a few javadocs (github branch) and the tests and precommit pass for me.

        mikemccand Michael McCandless added a comment -

        Nice job fixing a few ancient typos!

        Looks like javadocs for the private MergeRateLimiter.maybePause method are stale?

        Why are we creating MergeRateLimiter on init of MergeThread and then again in CMS.wrapForMerge? Seems like we could cast the current thread to MergeThread and get its already-created instance?

        Why not updateIOThrottle in the main CMS thread, not the merge thread? Else, I think we have an added delay, from when a backlog'd merge shows up, to when the already running merge threads kick up their IO throttle?

        Maybe add a comment to OneMergeProgress.owner and .setMergeThread that it's only used for catching misuse?

        Can we rename OneMergeProgress.pauseTimes -> pauseTimesNanos or NS?

        You can just remove the //private final Directory mergeDirectory from IW.

        Hmm it looks like CFS building is still unthrottled?
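
        A minimal sketch of the reuse idea above, assuming the CMS's MergeThread keeps its MergeRateLimiter in a rateLimiter field as in the patch (illustrative only, not the committed code):

          // Inside ConcurrentMergeScheduler (sketch; org.apache.lucene.store imports elided):
          @Override
          public Directory wrapForMerge(MergePolicy.OneMerge merge, Directory in) {
            // wrapForMerge runs on the merge thread itself, so the per-thread rate
            // limiter created in the MergeThread constructor can simply be reused
            // instead of instantiating a second one.
            Thread t = Thread.currentThread();
            assert t instanceof MergeThread : "wrapForMerge should be called from a MergeThread";
            MergeRateLimiter rateLimiter = ((MergeThread) t).rateLimiter;

            return new FilterDirectory(in) {
              @Override
              public IndexOutput createOutput(String name, IOContext context) throws IOException {
                ensureOpen();
                // Throttle only the files written by this merge.
                return new RateLimitedIndexOutput(rateLimiter, in.createOutput(name, context));
              }
            };
          }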

        dweiss Dawid Weiss added a comment -

        I'm on a short holiday, Mike. Will reply on Tuesday in full. But the CFS thing – I don't think I changed anything there; that part in addIndexes was never really throttled properly... I think it should just run as fast as possible. Either that, or we should make it comply with the MergeScheduler's throttling strategy (although this would require creating a OneMerge, which is odd).

        mikemccand Michael McCandless added a comment -

        But the CFS thing – I don't think I changed anything there;

        Aha! You are correct! I was mis-reading the patch; I didn't realize the CFS change was just for addIndexes, but you're right. Building CFS for a merged segment is in fact going through the wrapped Directory, so it can be throttled ... good. I agree we should not change addIndexes behavior here.

        dweiss Dawid Weiss added a comment - edited

        Thanks for the comments, Mike!

        Looks like javadocs for the private MergeRateLimiter.maybePause method are stale?

        Corrected. I also changed some internal comments concerning waits < 1 ms (these are possible with the new locks API, but we still don't bother) and introduced some more informative constants where appropriate.

        Why are we creating MergeRateLimiter on init of MergeThread and then again in CMS.wrapForMerge? Seems like we could cast the current thread to MergeThread and get its already-created instance?

        Good catch, corrected.

        Why not updateIOThrottle in the main CMS thread, not the merge thread? Else, I think we have an added delay, from when a backlog'd merge shows up, to when the already running merge threads kick up their IO throttle?

        I admit I didn't try to understand all of the CMS's rate-limit logic as it was quite complex, so I don't fully understand you here. The start of the merge thread seemed like a sensible place to run the IO throttle update, but I moved it back – it doesn't seem to hurt anything.

        Maybe add a comment to OneMergeProgress.owner and .setMergeThread that it's only used for catching misuse?

        Done.

        Can we rename OneMergeProgress.pauseTimes -> pauseTimesNanos or NS?

        Hehe... sure, sure.

        You can just remove the //private final Directory mergeDirectory from IW.

        Done.

        Hmm it looks like CFS building is still unthrottled?

        Already discussed.

        Running tests now.

        dweiss Dawid Weiss added a comment -

        I screwed up something in the latest patch because I'm getting assertion errors, will fix.

        dweiss Dawid Weiss added a comment -

        Ok, that was a trivial regression:

        --- a/lucene/core/src/java/org/apache/lucene/index/MergePolicy.java
        +++ b/lucene/core/src/java/org/apache/lucene/index/MergePolicy.java
        @@ -177,7 +177,7 @@ public abstract class MergePolicy {
             }
        
             final void setMergeThread(Thread owner) {
        -      assert owner == null;
        +      assert this.owner == null;
               this.owner = owner;
             }
           }
        
        mikemccand Michael McCandless added a comment -

        Thanks Dawid Weiss, the new patch looks great! +1 to push, and thank you for cleaning this up!

        dweiss Dawid Weiss added a comment -

        Thanks for reviewing, Mike. I'm guessing this should go to 7.x only, right? Or can we backport it to 6.x as well, risking some minor API incompatibilities (these are internal APIs anyway, but who knows).

        mikemccand Michael McCandless added a comment -

        I think it's fine to backport to 6.x.

        dweiss Dawid Weiss added a comment -

        Final patch.

        jira-bot ASF subversion and git services added a comment -

        Commit 9540bc37583dfd4e995b893154039fcf031dc3c3 in lucene-solr's branch refs/heads/master from Dawid Weiss
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9540bc3 ]

        LUCENE-7700: Move throughput control and merge aborting out of IndexWriter's core.

        jira-bot ASF subversion and git services added a comment -

        Commit 8c5ea32bb9f2d9d8af98162e1e19c9559c8c602d in lucene-solr's branch refs/heads/branch_6x from Dawid Weiss
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8c5ea32 ]

        LUCENE-7700: Move throughput control and merge aborting out of IndexWriter's core.


          People

          • Assignee:
            dweiss Dawid Weiss
            Reporter:
            dweiss Dawid Weiss
          • Votes:
            0
            Watchers:
            3
