Details
- Epic
- Status: Resolved
- Major
- Resolution: Fixed
- None
- TarMK transaction rate
Description
The TarMK's write throughput is limited by the way concurrent commits are processed: rebasing and running the commit hooks happen within a lock, without any explicit scheduling. This epic covers improving the overall transaction rate. The proposed approach is roughly to first make the scheduling of transactions explicit, then add monitoring of transactions to gain a better understanding of their behaviour, and finally experiment with and implement explicit scheduling strategies to optimise particular aspects.
Summary of ideas mentioned in an offline session
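A minimal sketch of what "making scheduling explicit" could mean, in Java: instead of each committer acquiring a lock itself, every commit is handed to a scheduler object that decides when it runs. All names here (`Commit`, `CommitScheduler`, `FifoScheduler`) are hypothetical and are not the Oak API; the repository state is reduced to a single string head for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.UnaryOperator;

final class ExplicitSchedulingSketch {

    /** Hypothetical commit: takes the current head and returns the new head. */
    interface Commit extends UnaryOperator<String> {}

    /** Hypothetical scheduler API: the single entry point for all commits. */
    interface CommitScheduler {
        String schedule(Commit commit);
    }

    /** Simplest strategy: a fair lock, so commits run one at a time, longest waiter first. */
    static final class FifoScheduler implements CommitScheduler {
        private final ReentrantLock lock = new ReentrantLock(true);
        private String head;

        FifoScheduler(String initialHead) {
            this.head = initialHead;
        }

        @Override
        public String schedule(Commit commit) {
            lock.lock();
            try {
                // Rebasing and commit hooks would run here, still serialised,
                // but now behind an interface that other strategies can implement.
                head = commit.apply(head);
                return head;
            } finally {
                lock.unlock();
            }
        }
    }

    public static void main(String[] args) {
        CommitScheduler scheduler = new FifoScheduler("r0");
        System.out.println(scheduler.schedule(head -> head + "+a"));
        System.out.println(scheduler.schedule(head -> head + "+b"));
    }
}
```

The behaviour is the same as a plain lock; the point of the sketch is only that the policy now lives in one replaceable place, which is what monitoring and alternative strategies would hook into.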
Advantages of explicit scheduling:
- Control over (the order of) commits
- Sophisticated monitoring (commit statistics, e.g. commit rate, time in queue, etc.)
- Favour certain commits (e.g. checkpoints)
- Reorder commits to simplify rebasing
- Suspend the compactor on concurrent commits and have it resume where it left off afterwards
- Parallelise certain commits (e.g. by piggy backing)
- Implement a concurrent commit editor. We'd need to take care of proper access to the shared state. Francesco Mari: maybe introduce the idea of a common context to enforce concurrent access semantics.
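The monitoring item above (commit rate, time in queue) could be collected along these lines. This is a sketch under assumptions: `CommitStats` and its methods are hypothetical names, and the scheduler is assumed to pass in three timestamps per commit (enqueued, started, finished), in nanoseconds.

```java
import java.util.concurrent.atomic.LongAdder;

final class CommitMonitoringSketch {

    /** Hypothetical statistics collector fed by the scheduler; all times in nanoseconds. */
    static final class CommitStats {
        private final LongAdder commits = new LongAdder();
        private final LongAdder queuedNanos = new LongAdder();
        private final LongAdder runningNanos = new LongAdder();

        /** Records one commit: when it was enqueued, when it started, when it finished. */
        void record(long enqueuedAt, long startedAt, long finishedAt) {
            commits.increment();
            queuedNanos.add(startedAt - enqueuedAt);
            runningNanos.add(finishedAt - startedAt);
        }

        long commitCount() {
            return commits.sum();
        }

        /** Mean time a commit waited in the queue, or 0 if nothing was recorded. */
        double meanQueuedNanos() {
            long n = commits.sum();
            return n == 0 ? 0 : (double) queuedNanos.sum() / n;
        }

        /** Mean time a commit spent executing, or 0 if nothing was recorded. */
        double meanRunningNanos() {
            long n = commits.sum();
            return n == 0 ? 0 : (double) runningNanos.sum() / n;
        }

        /** Commits per second over an observation window of the given length. */
        double commitRate(long windowNanos) {
            return commits.sum() * 1_000_000_000.0 / windowNanos;
        }
    }

    public static void main(String[] args) {
        CommitStats stats = new CommitStats();
        long enqueued = System.nanoTime();
        long started = System.nanoTime();
        // ... rebase and run the commit hooks here ...
        long finished = System.nanoTime();
        stats.record(enqueued, started, finished);
        System.out.println(stats.commitCount() + " commit(s), mean queue time "
                + stats.meanQueuedNanos() + " ns");
    }
}
```

`LongAdder` is used so that many committing threads can record concurrently without contending on a single counter; the three sums are not read atomically together, which is usually acceptable for statistics.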
Scheduler Implementation
- Expedite
- Prioritise
- Defer
- Collapse
- Coalesce
- Parallelise
- Piggy back: can we piggy back commits on top of each other? The idea would be, while processing the changes of one commit, to also check them for conflicts with the changes of other commits waiting to commit. If a conflict is detected, that other commit can be failed immediately (provided the current commit doesn't fail).
- Merging non-conflicting commits: given multiple transactions ready to commit at the same time, can we process them as one (provided they don't conflict) instead of one after the other, which requires the later transaction to be rebased on the former?
- Shield the file store from InterruptedException because of thread boundaries introduced
- Implement tests, benchmarks and fixtures for verification
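The "merging non-conflicting commits" idea from the list above can be sketched as follows. The model is deliberately simplified and is an assumption, not how the segment store represents changes: a commit is a map from path to new value, and two commits conflict if and only if they touch a common path. The names `coalesce`, `conflicts` and `Batch` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CoalescingSketch {

    /** Result of one pass: the merged change set plus the commits left for a later pass. */
    static final class Batch {
        final Map<String, String> merged = new LinkedHashMap<>();
        final List<Map<String, String>> deferred = new ArrayList<>();
    }

    /** Simplifying assumption: two change sets conflict iff they touch a common path. */
    static boolean conflicts(Map<String, String> a, Map<String, String> b) {
        return !Collections.disjoint(a.keySet(), b.keySet());
    }

    /** Greedily folds pending commits into one batch; conflicting ones are deferred. */
    static Batch coalesce(List<Map<String, String>> pending) {
        Batch batch = new Batch();
        for (Map<String, String> changes : pending) {
            if (conflicts(batch.merged, changes)) {
                batch.deferred.add(changes);
            } else {
                batch.merged.putAll(changes);
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        List<Map<String, String>> pending = List.of(
                Map.of("/a", "1"),
                Map.of("/b", "2"),
                Map.of("/a", "3"));
        Batch batch = coalesce(pending);
        System.out.println("merged=" + batch.merged + " deferred=" + batch.deferred);
    }
}
```

The merged batch would then be applied as a single commit, so the second transaction needs no rebase on the first. Note that this greedy pass can let a later commit overtake a deferred one, so it also reorders commits; whether that is acceptable is a policy decision for the scheduler.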
Attachments
Issue Links
- incorporates: OAK-1576 SegmentMK: Implement refined conflict resolution for addExistingNode conflicts (Open)
Issues in epic
Key | Summary | Status | Assignee
OAK-4122 | Replace the commit semaphore in the segment node store with a scheduler | Closed | Andrei Dulceanu
OAK-7162 | Race condition on revisions head between compaction and scheduler could result in skipped commit | Closed | Andrei Dulceanu
OAK-4732 | (Slightly) prioritise reads over writes | Closed | Andrei Dulceanu
OAK-5853 | Potential expensive call to NodeState.getChildNodeCount() in constructor of Template | Resolved | Michael Dürig
OAK-6051 | Clarify migration tests failures when switching Commit#hasChanges implementations | Closed | Andrei Dulceanu
OAK-6065 | Rework CheckpointTest and MergeTest after introducing the scheduler | Closed | Andrei Dulceanu
OAK-6074 | Simplify merge logic in LockBasedScheduler | Closed | Andrei Dulceanu
OAK-6137 | Remove call to getHeadNodeState in LockBasedScheduler constructor | Closed | Andrei Dulceanu
OAK-6138 | Remove addObserver method from Scheduler API | Closed | Andrei Dulceanu
OAK-6428 | Add flag for controlling percentile of commit time used in scheduler | Closed | Andrei Dulceanu
OAK-6430 | Remove Apache Commons Math3 dependency from Segment Tar | Closed | Andrei Dulceanu
OAK-5464
TarMK transaction rate