Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- 2.1
- None
Description
We've received several reports of poor LFS performance with persistence enabled: an Ignite node stops responding to user operations under intensive load. A typical operations-per-second graph is attached.
Zero dropdowns (throughput falling to zero) happen during an ongoing checkpoint when the checkpoint buffer (the memory segment that accumulates old versions of dirty pages not yet written by the current checkpoint) overflows.
In general, some performance decrease is inevitable: writing to memory is always faster than writing to disk. However, we can avoid zero dropdowns if we throttle the threads that generate dirty pages.
We can manage the amount of throttle time with a token bucket algorithm:
1) Before the checkpoint starts, we calculate the ratio K = (number of checkpoint pages) / (size of checkpoint buffer) and initialize a non-negative atomic marker counter.
2) Every checkpointing thread increments the marker counter once per K written pages.
3) Any thread that makes a page dirty must decrement the marker counter. The thread must wait if the marker counter is zero.
This algorithm makes buffer overflow impossible. If activity is intensive and constant, user threads will write at the speed of the disk. On the other hand, in case of burst activity, user threads will write at maximum speed.
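The three steps above could be sketched roughly as follows. This is only an illustration, not Ignite's actual implementation: the class `CheckpointThrottle` and all of its method names are hypothetical, a `java.util.concurrent.Semaphore` stands in for the non-negative atomic marker counter, K is rounded up to a whole number of pages (at least 1), and the initial marker count is left as a parameter because the description does not specify it.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of marker-based throttling; names are illustrative only. */
final class CheckpointThrottle {
    /** Marker counter. A Semaphore never drops below zero and parks waiting threads. */
    private final Semaphore markers;

    /** Ratio K from step 1: pages that must be written per released marker. */
    private final long pagesPerMarker;

    /** Pages written so far by checkpointing threads in the current checkpoint. */
    private final AtomicLong writtenPages = new AtomicLong();

    /**
     * Step 1: compute K before the checkpoint starts.
     * Assumption: K is rounded up and kept >= 1; initialMarkers is caller-chosen.
     */
    CheckpointThrottle(long checkpointPages, long checkpointBufferPages, int initialMarkers) {
        if (checkpointBufferPages <= 0)
            throw new IllegalArgumentException("checkpoint buffer size must be positive");

        long k = (checkpointPages + checkpointBufferPages - 1) / checkpointBufferPages;

        this.pagesPerMarker = Math.max(1L, k);
        this.markers = new Semaphore(initialMarkers);
    }

    long pagesPerMarker() {
        return pagesPerMarker;
    }

    int availableMarkers() {
        return markers.availablePermits();
    }

    /** Step 2: called by a checkpointing thread after each written page. */
    void onPageWritten() {
        // Exactly one caller observes each multiple of K, so one marker is added per K pages.
        if (writtenPages.incrementAndGet() % pagesPerMarker == 0)
            markers.release();
    }

    /** Step 3: called by a user thread before it makes a page dirty; blocks at zero. */
    void beforePageDirtied() throws InterruptedException {
        markers.acquire();
    }

    /** Non-blocking variant of step 3: returns false instead of waiting. */
    boolean tryBeforePageDirtied() {
        return markers.tryAcquire();
    }
}
```

With 1000 checkpoint pages and a 100-page checkpoint buffer, K is 10: a user thread that finds the counter at zero is released only after checkpointing threads have written ten more pages, so under constant load page dirtying proceeds at a rate tied to the disk write rate.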
Attachments
Issue Links
- is related to
  - IGNITE-7175 Throttling is not applied to page allocation (Resolved)
  - IGNITE-7182 Slow sorting of pages collection on checkpoint begin can cause zero dropdown even with throttling enabled (Resolved)
  - IGNITE-7533 Throttle writting threads according fsync progress and checkpoint writting speed instead of region fill (Resolved)
- links to