  1. Lucene - Core
  2. LUCENE-10448

MergeRateLimiter doesn't always limit instant rate.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 8.11.1
    • Fix Version/s: None
    • Component/s: core/other
    • Labels: None
    • Lucene Fields: New

    Description

      Consider this code in MergeRateLimiter:

      private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {

        double rate = mbPerSec;
        double secondsToPause = (bytes / 1024. / 1024.) / rate;
        long targetNS = lastNS + (long) (1000000000 * secondsToPause);
        long curPauseNS = targetNS - curNS;

        // We don't bother with thread pausing if the pause is smaller than 2 msec.
        if (curPauseNS <= MIN_PAUSE_NS) {
          // Set to curNS, not targetNS, to enforce the instant rate, not
          // the "averaged over all history" rate:
          lastNS = curNS;
          return -1;
        }
        ......
      }
      

      Suppose a segment is being merged and maybePause is called at 7:00, so lastNS = 7:00. If maybePause is next called at 7:05, then targetNS = lastNS + (long) (1000000000 * secondsToPause) is smaller than curNS unless bytes is large enough to account for the whole five-minute gap at the configured rate, so the method returns -1 and skips the pause.
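
The scenario above can be reproduced with a minimal standalone sketch of the quoted arithmetic. This is not the Lucene class itself: the class name PauseSketch, the method pauseNanos, and the sample values (10 MB written, 10 MB/sec) are hypothetical, and MIN_PAUSE_NS is assumed to be 2 msec based on the comment in the quoted code.

```java
// Standalone model of the quoted maybePause arithmetic (illustrative only).
class PauseSketch {
    // Assumption: 2 msec, taken from the comment in the quoted code.
    static final long MIN_PAUSE_NS = 2_000_000L;

    /** Returns the pause in nanoseconds, or -1 when the pause would be skipped. */
    static long pauseNanos(long bytes, double mbPerSec, long lastNS, long curNS) {
        double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
        long targetNS = lastNS + (long) (1_000_000_000L * secondsToPause);
        long curPauseNS = targetNS - curNS;
        return curPauseNS <= MIN_PAUSE_NS ? -1 : curPauseNS;
    }

    public static void main(String[] args) {
        long bytes = 10L * 1024 * 1024;            // 10 MB (hypothetical)
        double rate = 10.0;                        // 10 MB/sec (hypothetical)
        long lastNS = 0L;                          // time of the previous call
        long fiveMinutesNS = 300L * 1_000_000_000L;

        // Next call arrives immediately: a full one-second pause is requested.
        System.out.println(pauseNanos(bytes, rate, lastNS, lastNS));                 // 1000000000
        // Next call arrives five minutes later: targetNS < curNS, so no pause.
        System.out.println(pauseNanos(bytes, rate, lastNS, lastNS + fiveMinutesNS)); // -1
    }
}
```

With these sample values the same 10 MB write is throttled for one second when the calls are back to back, and not at all when five minutes have elapsed since lastNS.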

      I counted the total number of maybePause calls (callTimes), the number of calls that skipped the pause (ignorePauseTimes), and the bytes passed to each skipped call (detailBytes):

      [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
      

      maybePause was called 857 times, and 25 of those calls skipped the pause. The bytes involved in the skipped calls (for example, 0.28125 MB) are not small.
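
To put a number on "not small", the sketch below estimates the pause that one skipped call would nominally correspond to. It assumes the 11.2 MB/sec throttle from the log line applied at the time of the call, which the log does not confirm; the class and method names are hypothetical.

```java
import java.util.Locale;

// Estimates the nominal pause for a skipped call (illustrative only).
class SkippedPauseEstimate {
    /** Nominal pause in milliseconds for the given megabytes at the given rate. */
    static double nominalPauseMillis(double megabytes, double mbPerSec) {
        return megabytes / mbPerSec * 1000.0;
    }

    public static void main(String[] args) {
        // 0.28125 MB is one of the detailBytes values; 11.2 MB/sec is the logged throttle.
        double ms = nominalPauseMillis(0.28125, 11.2);
        System.out.println(String.format(Locale.ROOT, "%.1f ms", ms)); // 25.1 ms
    }
}
```

Under that assumption each skipped call corresponds to roughly 25 msec of pause, well above the 2 msec threshold. By the quoted formula, such a call is skipped only when at least about 23 msec have already elapsed since lastNS.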

      Whenever the interval between two maybePause calls is relatively long, a pause that should have been taken is skipped.

          People

            Assignee: Unassigned
            Reporter: kkewwei
            Votes: 0
            Watchers: 3

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 3h 10m