  1. Flink
  2. FLINK-31474

Add failure information for out-of-order checkpoints

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved

    Description

      Currently, when a checkpoint barrier arrives out of order, the Task side only logs an out-of-order message. On the JobManager (JM) side, the checkpoint can then fail only by timing out, so the real cause cannot be determined.

      Therefore, I think the out-of-order case should also report failure information to the JM. The Task-side check currently looks like this:

      // The barrier's checkpoint ID is not newer than the last one this task handled.
      if (lastCheckpointId >= metadata.getCheckpointId()) {
          // Only a Task-side log entry is written here; nothing is reported to the JM.
          LOG.info(
                  "Out of order checkpoint barrier (aborted previously?): {} >= {}",
                  lastCheckpointId,
                  metadata.getCheckpointId());
          channelStateWriter.abort(metadata.getCheckpointId(), new CancellationException(), true);
          checkAndClearAbortedStatus(metadata.getCheckpointId());
          return;
      }
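
      To make the proposal concrete, here is a minimal standalone sketch of the idea, not Flink code. `DeclineReporter`, `RecordingReporter`, and `onCheckpointBarrier` are hypothetical names introduced only for illustration; how the Task would actually notify the JM is not specified in this issue. The sketch assumes the Task has some channel for sending a decline message and uses it in the out-of-order branch instead of only logging.

```java
import java.util.ArrayList;
import java.util.List;

public class OutOfOrderCheckpointSketch {

    /** Hypothetical stand-in for the channel a task would use to notify the JM. */
    public interface DeclineReporter {
        void declineCheckpoint(long checkpointId, String reason);
    }

    /** Simple reporter that records every decline message, for demonstration. */
    public static class RecordingReporter implements DeclineReporter {
        public final List<String> declined = new ArrayList<>();

        @Override
        public void declineCheckpoint(long checkpointId, String reason) {
            declined.add(checkpointId + ": " + reason);
        }
    }

    private final DeclineReporter reporter;
    // ID of the newest checkpoint this task has started; -1 means none yet.
    private long lastCheckpointId = -1L;

    public OutOfOrderCheckpointSketch(DeclineReporter reporter) {
        this.reporter = reporter;
    }

    /**
     * Handles a barrier for the given checkpoint.
     *
     * @return true if the checkpoint proceeds, false if it was rejected as out of order
     */
    public boolean onCheckpointBarrier(long checkpointId) {
        if (lastCheckpointId >= checkpointId) {
            // Assumption: besides logging, the task tells the JM why it gave up,
            // so the JM does not have to wait for the checkpoint timeout.
            reporter.declineCheckpoint(
                    checkpointId,
                    String.format(
                            "Out of order checkpoint barrier (aborted previously?): %d >= %d",
                            lastCheckpointId, checkpointId));
            return false;
        }
        lastCheckpointId = checkpointId;
        return true;
    }

    public static void main(String[] args) {
        RecordingReporter reporter = new RecordingReporter();
        OutOfOrderCheckpointSketch task = new OutOfOrderCheckpointSketch(reporter);
        task.onCheckpointBarrier(1L); // accepted
        task.onCheckpointBarrier(2L); // accepted
        task.onCheckpointBarrier(2L); // rejected: 2 >= 2, reported to the reporter
        System.out.println(reporter.declined);
    }
}
```

      With something along these lines, the JM side would receive an explicit reason for the failed checkpoint rather than inferring it from a timeout.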

          People

            Assignee: Unassigned
            Reporter: Ming Li
            Votes: 1
            Watchers: 6