Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- None
Description
CheckpointProgressImpl#onStartPartitionProcessing and CheckpointProgressImpl#onFinishPartitionProcessing don't work as intended for several reasons:
- There's a race: we could call onFinish before onStart is called in a concurrent thread. This might happen if there's only a handful of dirty pages in each partition and there is more than one checkpoint thread. Basically, this protection doesn't work.
- Even if that particular race didn't exist, this code still wouldn't work, because some pages could be added to the pageIdsToRetry map. That map is processed later, after writePages has finished, meaning that we mark unfinished partitions as finished.
- Due to the aforementioned bugs, I didn't bother including these methods in drainCheckpointBuffers. As a result, that method requires a fix too.
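The interleaving suspected in the first bullet can be replayed deterministically with a simplified model. This is only a sketch: the class PartitionProgressSketch, its counter map, and the methods onStart, onFinish, inFlight and finishedTooEarly are hypothetical stand-ins, not the actual CheckpointProgressImpl code, and the two "writers" are simulated in a single thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of per-partition progress tracking; not the real Ignite implementation. */
public class PartitionProgressSketch {
    /** Number of writers currently processing each partition (assumed design). */
    private final Map<Integer, AtomicInteger> inFlight = new ConcurrentHashMap<>();

    /** A writer announces that it is about to process pages of the partition. */
    public void onStart(int partId) {
        inFlight.computeIfAbsent(partId, id -> new AtomicInteger()).incrementAndGet();
    }

    /** A writer announces it is done; returns true if the partition is now considered finished. */
    public boolean onFinish(int partId) {
        return inFlight.get(partId).decrementAndGet() == 0;
    }

    /** Current number of writers registered for the partition. */
    public int inFlight(int partId) {
        AtomicInteger cnt = inFlight.get(partId);
        return cnt == null ? 0 : cnt.get();
    }

    /** Replays the suspected interleaving of two writers sharing partition 0. */
    public static boolean finishedTooEarly() {
        PartitionProgressSketch progress = new PartitionProgressSketch();
        int partId = 0;

        // Writer B registers, writes its few dirty pages, and finishes first.
        progress.onStart(partId);
        boolean reportedFinished = progress.onFinish(partId);

        // Writer A only now registers for the same partition.
        progress.onStart(partId);

        // The partition was already reported finished, yet a writer is still active.
        return reportedFinished && progress.inFlight(partId) > 0;
    }

    public static void main(String[] args) {
        System.out.println("finished too early: " + finishedTooEarly());
    }
}
```

In this model the counter drops to zero between the two registrations, so the partition is reported as finished while one writer still has pages to write. Whether the real code can actually reach this interleaving depends on how pages are distributed across checkpoint threads.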
Upd:
The first and second problems can be solved within IGNITE-23115, once the pages of a single partition are written by only one thread; this will need to be checked.
After careful analysis, I found that there is no race, so I renamed some methods and added documentation to them. I also fixed drainCheckpointBuffers.
Issue Links
- is related to IGNITE-23115 Checkpoint single partition from a single thread (Resolved)
- links to