  1. Ignite
  2. IGNITE-6334

Throttle writing threads during ongoing checkpoint with token bucket algorithm


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1
    • Fix Version/s: 2.3
    • Component/s: persistence
    • Labels: None

    Description

      We've received several pieces of negative feedback about LFS performance with persistence enabled: under intensive load, an Ignite node stops responding to user operations. A typical operations-per-second graph is attached.
      Throughput drops to zero during an ongoing checkpoint when the checkpoint buffer (the memory segment that accumulates old versions of dirty pages that are not yet written in the current checkpoint) overflows.
      In general, some performance decrease is inevitable, since writing to memory is always faster than writing to disk. However, we can avoid the drops to zero if we throttle the threads that generate dirty pages.
      We can manage the amount of throttle time with a token bucket algorithm:
      1) Before the checkpoint starts, we calculate the ratio K = (number of checkpoint pages) / (size of checkpoint buffer) and initialize a non-negative atomic marker counter.
      2) Every checkpointing thread increments the marker counter once per K written pages.
      3) Any thread that makes a page dirty must decrement the marker counter. If the marker counter is zero, the thread waits.
      Such an algorithm makes buffer overflow impossible. If activity is intensive and constant, user threads will write at the speed of the disk. On the other hand, under burst activity user threads will write at maximum speed, because markers left unused during quiet periods remain in the counter.
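
      The three steps above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the code that was committed for this issue: the class name CheckpointThrottle, its method names, the initialTokens parameter and the park interval are all hypothetical. The description does not say what the initial counter value is, so it is left to the caller here. The sketch also uses one shared written-pages counter instead of a per-thread count, and rounds K down to a whole number of pages (at least 1).

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical sketch of the marker counter described above; not Ignite's actual implementation. */
final class CheckpointThrottle {
    /** Assumed back-off interval for a throttled thread. */
    private static final long PARK_NANOS = 100_000L;

    /** K: checkpoint pages that must be written to earn one marker. */
    private final long pagesPerToken;

    /** Non-negative atomic marker counter. */
    private final AtomicLong tokens;

    /** Pages written by checkpointing threads so far (shared counter, a simplification). */
    private final AtomicLong writtenPages = new AtomicLong();

    CheckpointThrottle(long checkpointPages, long bufferPages, long initialTokens) {
        if (checkpointPages < 0 || bufferPages <= 0 || initialTokens < 0)
            throw new IllegalArgumentException("Sizes must be non-negative, buffer size must be positive");

        // Step 1: K = (number of checkpoint pages) / (size of checkpoint buffer),
        // rounded down and clamped to at least 1 (an assumption of this sketch).
        this.pagesPerToken = Math.max(1, checkpointPages / bufferPages);
        this.tokens = new AtomicLong(initialTokens);
    }

    long pagesPerToken() {
        return pagesPerToken;
    }

    long availableTokens() {
        return tokens.get();
    }

    /** Step 2: called by a checkpointing thread after each page write; adds one marker per K pages. */
    void onCheckpointPageWritten() {
        if (writtenPages.incrementAndGet() % pagesPerToken == 0)
            tokens.incrementAndGet();
    }

    /** Takes one marker if available; the CAS loop never lets the counter go below zero. */
    boolean tryAcquire() {
        while (true) {
            long cur = tokens.get();

            if (cur == 0)
                return false;

            if (tokens.compareAndSet(cur, cur - 1))
                return true;
        }
    }

    /** Step 3: called by a thread before it makes a page dirty; waits while the counter is zero. */
    void acquire() {
        while (!tryAcquire())
            LockSupport.parkNanos(PARK_NANOS);
    }
}
```

      For example, with 100 checkpoint pages and a checkpoint buffer of 10 pages, K = 10: a writer that finds the counter at zero is released only after checkpointing threads have written another 10 pages.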

      Attachments

        1. opsec3.jpg
          88 kB
          Ivan Rakov

            People

              ivan.glukos Ivan Rakov
              ivan.glukos Ivan Rakov

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10m