Lucene - Core / LUCENE-10265

Merge IO write throttle rate can exceed the ceiling (1024 MB/s)

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 8.6.2
    • Fix Version/s: None
    • Component/s: core/other
    • Labels: None
    • Lucene Fields: New

    Description

      The merge IO write throttle rate is controlled by `targetMBPerSec` in ConcurrentMergeScheduler, and it should not exceed the ceiling (1024 MB/s).

      `targetMBPerSec` is shared by many merge threads and is updated as follows:

      if (newBacklog) {
        // This new merge adds to the backlog: increase IO throttle by 20%
        targetMBPerSec *= 1.20;
        if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
          targetMBPerSec = MAX_MERGE_MB_PER_SEC;
        }
        ......
      } else {
        // We are not falling behind: decrease IO throttle by 10%
        targetMBPerSec /= 1.10;
        if (targetMBPerSec < MIN_MERGE_MB_PER_SEC) {
          targetMBPerSec = MIN_MERGE_MB_PER_SEC;
        }
        ......
      }
      

      The modification is not an atomic operation:

      1. The first merge thread changes `targetMBPerSec` from 1024 to 1024*1.2.
      2. Another merge thread reads the new value (1024*1.2).
      3. The first merge thread then limits the value back to 1024.

      As a result, the second thread can work with a rate above the ceiling.
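
The interleaving in the three steps above can be replayed deterministically in a single thread. This is a minimal sketch, not Lucene's code: the class `ThrottleRaceSketch` and the method `replayInterleaving` are hypothetical names, and the ceiling of 1024 MB/s is simply the value quoted in this issue.

```java
// Hypothetical stand-in for the shared throttle state; not Lucene's actual class.
final class ThrottleRaceSketch {
  // Ceiling as quoted in this issue (an assumption of this sketch).
  static final double MAX_MERGE_MB_PER_SEC = 1024.0;

  // Shared field, updated in two separate steps as in the snippet above.
  static double targetMBPerSec = MAX_MERGE_MB_PER_SEC;

  /** Replays the three steps in order and returns the value the second thread reads. */
  static double replayInterleaving() {
    targetMBPerSec = MAX_MERGE_MB_PER_SEC;

    // Step 1: the first merge thread multiplies the shared field in place.
    targetMBPerSec *= 1.20;

    // Step 2: another merge thread reads the shared field before the clamp runs.
    double seenByOtherThread = targetMBPerSec;

    // Step 3: the first merge thread clamps the field back to the ceiling.
    if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
      targetMBPerSec = MAX_MERGE_MB_PER_SEC;
    }
    return seenByOtherThread;
  }

  public static void main(String[] args) {
    double seen = replayInterleaving();
    System.out.println("other thread read: " + seen + " MB/sec");
    System.out.println("final value:       " + targetMBPerSec + " MB/sec");
  }
}
```

The value read in step 2 is above the ceiling even though the field ends up clamped, which is the window described above.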

      In production, we do observe the IO write throttle rate exceeding the ceiling (1024 MB/s) during merges:

      [2021-11-26T15:27:19,861][TRACE][o.e.i.e.E.MS             ] [data1] [test1][25] elasticsearch[data1][refresh][T#5] MS: io throttle: current merge backlog; leave IO rate at 3589.1 MB/sec
      [2021-11-26T15:27:20,304][TRACE][o.e.i.e.E.MS             ] [data1] [test1][13] elasticsearch[data1][write][T#3] MS: io throttle: current merge backlog; leave IO rate at 192.4 MB/sec
      [2021-11-26T15:27:25,330][TRACE][o.e.i.e.E.MS             ] [data1] [test1][22] elasticsearch[data1][[test1][22]: Lucene Merge Thread #1026] MS: io throttle: current merge backlog; leave IO rate at 96.3 MB/sec
      [2021-11-26T15:27:25,995][TRACE][o.e.i.e.E.MS             ] [data1] [test1][16] elasticsearch[data1][[test1][16]: Lucene Merge Thread #1063] MS: io throttle: current merge backlog; leave IO rate at 419.2 MB/sec
      [2021-11-26T15:27:38,335][TRACE][o.e.i.e.E.MS             ] [data1] [test1][19] elasticsearch[data1][write][T#2] MS: io throttle: current merge backlog; leave IO rate at 3051.5 MB/sec
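
To pick out such entries from a larger log, a small scan like the following may help. It is only a sketch: `ThrottleLogCheck` and `ratesAboveCeiling` are hypothetical names, the pattern is derived from the message text in the lines above, and the 1024 MB/s threshold is the ceiling quoted in this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for scanning merge scheduler trace lines.
final class ThrottleLogCheck {
  // Ceiling as quoted in this issue (an assumption of this sketch).
  static final double CEILING_MB_PER_SEC = 1024.0;

  // Matches the "IO rate at <number> MB/sec" fragment of the trace message.
  private static final Pattern RATE =
      Pattern.compile("IO rate at ([0-9]+(?:\\.[0-9]+)?) MB/sec");

  /** Returns every logged rate that is above the ceiling, in input order. */
  static List<Double> ratesAboveCeiling(List<String> lines) {
    List<Double> above = new ArrayList<>();
    for (String line : lines) {
      Matcher m = RATE.matcher(line);
      if (m.find()) {
        double rate = Double.parseDouble(m.group(1));
        if (rate > CEILING_MB_PER_SEC) {
          above.add(rate);
        }
      }
    }
    return above;
  }
}
```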
      

      Should we do the following?
      1. Change `targetMBPerSec` with an atomic operation.
      2. Add the `volatile` modifier to `targetMBPerSec`.
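
One possible shape for the first suggestion is sketched below. It is an illustration under stated assumptions, not a patch for ConcurrentMergeScheduler: the class `ClampedThrottleSketch` and its methods are hypothetical, the floor value is a placeholder, and the ceiling is the 1024 MB/s quoted in this issue. The rate is stored as raw `double` bits in an `AtomicLong`, and each update computes and clamps the new value before publishing it, so readers never see an unclamped intermediate value.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of an atomically updated, always-clamped throttle value.
final class ClampedThrottleSketch {
  // Placeholder floor for illustration only (an assumption of this sketch).
  static final double MIN_MERGE_MB_PER_SEC = 5.0;
  // Ceiling as quoted in this issue (an assumption of this sketch).
  static final double MAX_MERGE_MB_PER_SEC = 1024.0;

  // The rate, stored as raw double bits so it can be updated with compare-and-set.
  private final AtomicLong targetBits;

  ClampedThrottleSketch(double initialMBPerSec) {
    targetBits = new AtomicLong(Double.doubleToLongBits(clamp(initialMBPerSec)));
  }

  private static double clamp(double value) {
    return Math.max(MIN_MERGE_MB_PER_SEC, Math.min(MAX_MERGE_MB_PER_SEC, value));
  }

  /** Applies the +20% / -10% rule and publishes only the clamped result. */
  double update(boolean newBacklog) {
    long bits = targetBits.updateAndGet(old -> {
      double current = Double.longBitsToDouble(old);
      double next = newBacklog ? current * 1.20 : current / 1.10;
      return Double.doubleToLongBits(clamp(next));
    });
    return Double.longBitsToDouble(bits);
  }

  /** Returns the current rate; it is always within [MIN, MAX]. */
  double get() {
    return Double.longBitsToDouble(targetBits.get());
  }
}
```

Marking the field `volatile` alone would make writes visible to other threads, but it would not make the multiply-then-clamp sequence atomic. Computing the clamped value in a local variable and assigning it once, inside a synchronized method, would be a simpler alternative with the same effect.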

          People

            Assignee: Unassigned
            Reporter: kkewwei