FLINK-34576: Flink deployment keeps staying in RECONCILING/STABLE status

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: kubernetes-operator-1.6.1
    • Fix Version/s: None
    • Component/s: Kubernetes Operator
    • Labels: None

    Description

      The flink-kubernetes-operator is running in HA mode. When one of the operator pods restarts, leadership switches to another pod. After the switch, however, some flinkdeployments remain in the JOB_STATUS=RECONCILING and LIFECYCLE_STATE=STABLE state for a long time.

      Running "kubectl describe flinkdeployment xxx" shows the following error, but there are no exceptions in the flink-kubernetes-operator log.

       

      Status:
        Cluster Info:
          Flink - Revision:             b6d20ed @ 2023-12-20T10:01:39+01:00
          Flink - Version:              1.14.0-GDC1.6.0
          Total - Cpu:                  7.0
          Total - Memory:               30064771072
        Error:                          {"type":"org.apache.flink.kubernetes.operator.exception.ReconciliationException","message":"org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: Failed to load configuration","additionalMetadata":{},"throwableList":[{"type":"org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException","message":"java.lang.RuntimeException: Failed to load configuration","additionalMetadata":{}},{"type":"java.lang.RuntimeException","message":"Failed to load configuration","additionalMetadata":{}}]}
        Job Manager Deployment Status:  READY
        Job Status:
          Job Id:    cf44b5e73a1f263dd7d9f2c82be5216d
          Job Name:  noah_stream_studio_1754211682_2218100380
          Savepoint Info:
            Last Periodic Savepoint Timestamp:  0
            Savepoint History:
          Start Time:     1705635107137
          State:          RECONCILING
          Update Time:    1709272530741
        Lifecycle State:  STABLE 
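
      With 1200+ flinkdeployments, checking each one with "kubectl describe" is slow. The following is a minimal sketch, not part of the operator, for spotting affected resources from JSON output. It assumes the status fields are named jobStatus.state, lifecycleState and error (the camelCase form of the labels shown above) and that the timestamps are epoch milliseconds; all function names are hypothetical.

```python
import json
from datetime import datetime, timezone


def root_cause(error_field):
    """Return (type, message) of the innermost throwable in the status error string."""
    if not error_field:
        return None
    try:
        err = json.loads(error_field)
    except json.JSONDecodeError:
        # The error is not JSON; hand back the raw text instead of failing.
        return ("unknown", error_field)
    chain = err.get("throwableList") or []
    # The last entry of throwableList is the deepest cause in the output above.
    last = chain[-1] if chain else err
    return (last.get("type"), last.get("message"))


def is_stuck(status):
    """True when a status matches the symptom described in this issue."""
    job_state = (status.get("jobStatus") or {}).get("state")
    return (
        job_state == "RECONCILING"
        and status.get("lifecycleState") == "STABLE"
        and bool(status.get("error"))
    )


def millis_to_utc(ms):
    """Convert an epoch-millisecond value (int or str) to an ISO-8601 UTC string."""
    seconds = int(ms) // 1000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


def stuck_deployments(resource_list):
    """Yield (namespace, name, last update, root cause) for every stuck item.

    resource_list is assumed to be the parsed output of
    `kubectl get flinkdeployment -A -o json`.
    """
    for item in resource_list.get("items", []):
        status = item.get("status") or {}
        if is_stuck(status):
            meta = item.get("metadata", {})
            updated = (status.get("jobStatus") or {}).get("updateTime")
            yield (
                meta.get("namespace"),
                meta.get("name"),
                millis_to_utc(updated) if updated else None,
                root_cause(status.get("error")),
            )
```

      To use it, save the output of "kubectl get flinkdeployment -A -o json" to a file, load it with json.load, and iterate over stuck_deployments(...). The Update Time value then shows when the operator last touched each resource, which can be compared with the time of the leader switch.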


      Versions:

      flink-kubernetes-operator: 1.6.1

      Flink: 1.14.0 / 1.15.2 (1200+ flinkdeployments)

       

          People

            Assignee: Unassigned
            Reporter: chenyuzhi (stupid_pig)