  1. Flink
  2. FLINK-10129

Flink job IDs are not deleted automatically from ZooKeeper metadata after canceling a Flink job in a Flink HA cluster

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Hi Team,

      Here is what I am looking for:

      • We have a dockerized Flink HA cluster (3 ZooKeeper nodes, 2 JobManagers, 3 TaskManagers).
      • Whenever we cancel a Flink job, the job is cancelled, but the cancelled job ID is not automatically deleted from the ZooKeeper metadata (inside the flink/jobgraph folder in ZooKeeper).
      • As a result, whenever one of the JobManagers goes down or is restarted, it does not come back up and throws an exception like "Could not find this job id xxxxxxxxxx".
      • The current workaround is to remove the cancelled job ID from the ZooKeeper metadata manually (but this is not a recommended solution).
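
      For reference, the manual workaround above could be scripted roughly as follows. This is only a sketch under assumptions, not a recommended fix: it uses the third-party kazoo ZooKeeper client, the function names are hypothetical, and the job graph path must be passed in because the exact znode location depends on the cluster's HA configuration. The list of still-running job IDs has to come from somewhere trusted (for example the output of `flink list`). It defaults to a dry run so nothing is deleted by accident.

```python
import re

# Assumption: Flink job IDs are rendered as 32 hexadecimal characters.
JOB_ID_RE = re.compile(r"^[0-9a-f]{32}$")


def stale_job_paths(jobgraph_root, znode_children, active_job_ids):
    """Return znode paths under jobgraph_root whose job ID is not active."""
    active = {job_id.lower() for job_id in active_job_ids}
    root = jobgraph_root.rstrip("/")
    return [
        f"{root}/{child}"
        for child in sorted(znode_children)
        # Skip children that do not look like job IDs, and jobs still running.
        if JOB_ID_RE.match(child.lower()) and child.lower() not in active
    ]


def remove_stale_jobgraphs(zk_hosts, jobgraph_root, active_job_ids, dry_run=True):
    """List (and, if dry_run is False, delete) stale job graph znodes."""
    # kazoo is a third-party client (pip install kazoo); it is imported here
    # so that stale_job_paths stays usable without it.
    from kazoo.client import KazooClient

    zk = KazooClient(hosts=zk_hosts)
    zk.start()
    try:
        children = zk.get_children(jobgraph_root)
        paths = stale_job_paths(jobgraph_root, children, active_job_ids)
        for path in paths:
            print(("would delete " if dry_run else "deleting ") + path)
            if not dry_run:
                zk.delete(path, recursive=True)
        return paths
    finally:
        zk.stop()
```

      A first run with dry_run=True only prints the znodes that would be removed, so the list can be checked against the cancelled job IDs before anything is deleted.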

       

      Please advise.

              People

              • Assignee:
                Unassigned
                Reporter:
                keshav.lodhi@widas.in Keshav Lodhi
              • Votes:
                0
                Watchers:
                4

                Dates

                • Created:
                  Updated:
                  Resolved: