[FLINK-18748] Savepoint would be queued unexpected if pendingCheckpoints less than maxConcurrentCheckpoints - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 1.11.0, 1.11.1
Fix Version/s: 1.11.2, 1.12.0
Component/s: Runtime / Checkpointing
Labels:
- pull-request-available

Description

After ~~FLINK-17342~~, when triggering a checkpoint/savepoint, we'll check whether the request can be triggered in CheckpointRequestDecider#chooseRequestToExecute, the logic is as follow:

Preconditions.checkState(Thread.holdsLock(lock));
// 1. 
if (isTriggering || queuedRequests.isEmpty()) {
   return Optional.empty();
}

// 2 too many ongoing checkpoitn/savepoint
if (pendingCheckpointsSizeSupplier.get() >= maxConcurrentCheckpointAttempts) {
   return Optional.of(queuedRequests.first())
      .filter(CheckpointTriggerRequest::isForce)
      .map(unused -> queuedRequests.pollFirst());
}

// 3 check the timestamp of last complete checkpoint
long nextTriggerDelayMillis = nextTriggerDelayMillis(lastCompletionMs);
if (nextTriggerDelayMillis > 0) {
   return onTooEarly(nextTriggerDelayMillis);
}

return Optional.of(queuedRequests.pollFirst());

But if currently pendingCheckpointsSizeSupplier.get() < maxConcurrentCheckpointAttempts, and the request is a savepoint, the savepoint will still wait some time in step 3.

I think we should trigger the savepoint immediately if pendingCheckpointSizeSupplier.get() < maxConcurrentCheckpointAttempts.

Attachments

Issue Links

links to

GitHub Pull Request #13045

Activity

People

Assignee:: tao wang

Reporter:: Congxian Qiu

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 29/Jul/20 05:32

Updated:: 04/Aug/20 08:01

Resolved:: 04/Aug/20 08:00