FLINK-18962

Improve error message if checkpoint directory is not writable

Details

    Description

      If the checkpoint directory configured in state.checkpoints.dir is not writable by the user Flink runs as, checkpoints are declined, but the real cause is not reported anywhere:

      • the Web UI says: "Cause: The job has failed" (the Flink job is running though)
      • the JM log says:
        2020-08-14 12:13:18,820 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator    [] - Triggering checkpoint 2 (type=CHECKPOINT) @ 1597399998819 for job 2c567b14e8d0833404931ef47dfec266.
        2020-08-14 12:13:18,921 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator    [] - Decline checkpoint 2 by task 0d4fd75374ad16c8d963679e3c2171ec of job 2c567b14e8d0833404931ef47dfec266 at a184deea621e3923fbfcb1d899348448 @ Nico-PC.lan (dataPort=35531).
        
      • the TM log says:
        2020-08-14 12:13:14,102 INFO  org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl [] - Checkpoint 1 has been notified as aborted, would not trigger any checkpoint.
        

      And that's it. There should be a real error message indicating that the checkpoint (sub-)directory could not be created.
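
To illustrate the kind of message being asked for, here is a minimal sketch of a writability probe. It is not Flink code: `CheckpointDirCheck` and `describeProblem` are hypothetical names, and the sketch assumes the checkpoint directory is a path on the local file system (in practice state.checkpoints.dir is often a URI for a remote file system, which this does not cover).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of Flink.
final class CheckpointDirCheck {

    /**
     * Tries to create (and then remove) a sub-directory under baseDir.
     * Returns null if that works, otherwise a message naming the directory
     * and the underlying cause.
     */
    static String describeProblem(Path baseDir) {
        if (!Files.isDirectory(baseDir)) {
            return "Checkpoint directory " + baseDir
                    + " does not exist or is not a directory";
        }
        try {
            // Uniquely named probe directory, so existing checkpoints are untouched.
            Path probe = Files.createTempDirectory(baseDir, "write-probe-");
            Files.delete(probe);
            return null;
        } catch (IOException e) {
            return "Could not create a checkpoint sub-directory in " + baseDir
                    + " (" + e.getClass().getSimpleName() + ": " + e.getMessage()
                    + "); check that the user running Flink can write to it";
        }
    }
}
```

With a check along these lines, the declined checkpoint in the logs above could carry the directory name and the underlying I/O error instead of only "has been notified as aborted".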

            People

              roman Roman Khachatryan
              nkruber Nico Kruber