Flink / FLINK-25963

FLIP-212: Introduce Flink Kubernetes Operator

Sub-Tasks

      #. Summary | Type | Status | Assignee
      1. Initial Kubernetes Operator Prototype contribution | Sub-task | Closed | Gyula Fora
      2. Separate job and deployment errors in FlinkDeployment status | Sub-task | Closed | Gyula Fora
      3. Implement shared validation logic for FlinkDeployment objects | Sub-task | Closed | Gyula Fora
      4. Create webhook REST api test | Sub-task | Closed | Nicholas Jiang
      5. Create controller test for stateful/stateless upgrade flow | Sub-task | Closed | Gyula Fora
      6. Improve JobStatus tracking and handle different job states | Sub-task | Closed | Unassigned
      7. Support last-state upgrade mode | Sub-task | Closed | Yang Wang
      8. Integrate flink-kubernetes-operator repo with CI/CD | Sub-task | Closed | Yang Wang
      9. Containers Should Not Run As Root | Sub-task | Closed | Matyas Orhidi
      10. Refactor FlinkUtils#getEffectiveConfig into smaller pieces | Sub-task | Closed | Biao Geng
      11. Use the same enum for expected and observed jobstate (JobState / JobStatus.state) | Sub-task | Closed | Biao Geng
      12. Support manual savepoint triggering in the operator | Sub-task | Closed | Matyas Orhidi
      13. Support kubernetes-operator metrics using the Flink metric system | Sub-task | Closed | Matyas Orhidi
      14. Commit generated CRD into Git repo | Sub-task | Closed | Matyas Orhidi
      15. Publish Kubernetes operator to container registry | Sub-task | Closed | Márton Balassi
      16. Make 'replicas' work in JobManager Spec | Sub-task | Closed | Biao Geng
      17. Track and cap retries in ReconciliationStatus | Sub-task | Closed | Matyas Orhidi
      18. Document metrics configuration for Prometheus | Sub-task | Closed | Matyas Orhidi
      19. Support watching specific namespace for FlinkDeployments | Sub-task | Closed | Gyula Fora
      20. Reconciliation should try to start job when not already started or move to permanent error | Sub-task | Closed | Thomas Weise
      21. Revisit serviceAccount and taskSlots direct fields in CRD | Sub-task | Closed | Thomas Weise
      22. Remove need for flink-operator clusterrole | Sub-task | Closed | Márton Balassi
      23. Control Logging Behavior in Flink Deployments | Sub-task | Closed | Matyas Orhidi
      24. Move the Operator Env to the common Utils | Sub-task | Closed | WenJun Min
      25. Deletion should remove HA related configmaps also | Sub-task | Closed | Gyula Fora
      26. Avoid load flink conf each reconcile loop | Sub-task | Closed | WenJun Min
      27. Introduce the webhook config to free the environment options | Sub-task | Closed | Unassigned
      28. Revisit the creation of RestClusterClient | Sub-task | Closed | WenJun Min
      29. Move sanity check in FlinkService#cancelJob to DefaultDeploymentValidator | Sub-task | Closed | Nicholas Jiang
      30. Make Flink cluster communication asynchronous | Sub-task | Closed | Sandor Kelemen
      31. Extract Reconciler interface | Sub-task | Closed | WenJun Min
      32. Make some option of operator configurable | Sub-task | Closed | WenJun Min
      33. Add validation check of num of JM replica | Sub-task | Closed | Biao Geng
      34. Cleanly separate validator, observer and reconciler modules | Sub-task | Closed | Gyula Fora
      35. SharedIndexInformer should respect watched namespaces | Sub-task | Closed | Gyula Fora
      36. Clean up webhook jar and dependency management | Sub-task | Closed | Nicholas Jiang
      37. Improve operator logging | Sub-task | Closed | Gyula Fora
      38. Deleting the operator while jobs are running causes the jobs to fail | Sub-task | Closed | Márton Balassi
      39. Introduce Savepoint object in JobStatus | Sub-task | Closed | Matyas Orhidi
      40. Observer should support JobManager deployment crashed or deleted externally | Sub-task | Closed | Thomas Weise
      41. Introduce flink-kubernetes-shaded to avoid overlapping classes | Sub-task | Closed | Yang Wang
      42. Last state upgrade mode should allow reconciliation regardless of job and deployment status | Sub-task | Closed | Gyula Fora
      43. Webhook should only validate on /validate endpoint and log errors for others | Sub-task | Closed | Nicholas Jiang
      44. Reconsider setting generationAwareEventProcessing = true | Sub-task | Closed | Gyula Fora
      45. Trigger the updateControl when the FlinkDeployment has changed | Sub-task | Closed | WenJun Min
      46. Ability to restart deployment w/o spec change | Sub-task | Closed | WenJun Min
      47. Extract Observer Interface | Sub-task | Closed | WenJun Min
      48. Try to use @EnableKubernetesMockClient(crud = true) in controller test | Sub-task | Closed | Nicholas Jiang
      49. Clean termination of FlinkDeployment | Sub-task | Closed | Gyula Fora
      50. Savepoint trigger/tracking improvements | Sub-task | Closed | Matyas Orhidi
      51. Re-schedule reconcile more often until job is in ready state | Sub-task | Closed | Gyula Fora
      52. Allow defining Operator configuration in Helm chart Values | Sub-task | Closed | Gyula Fora
      53. Avoid state loss when switching to last-state upgrade mode | Sub-task | Closed | Yang Wang
      54. Check if JM can serve rest api calls every time before reconcile | Sub-task | Closed | Biao Geng
      55. Mark CRD classes experimental | Sub-task | Closed | Nicholas Jiang
      56. Consider changing flinkVersion to enum type or removing it completely | Sub-task | Closed | Gyula Fora
      57. Add startTime in JobStatus | Sub-task | Closed | Nicholas Jiang
      58. Improve the observe logic in SessionObserver | Sub-task | Closed | Biao Geng
      59. Specify EventSource when watching multiple namespaces | Sub-task | Closed | Gyula Fora
      60. Set ClusterIP service type when watching specific namespaces | Sub-task | Closed | Gyula Fora
      61. Introduce Ingress URL templating | Sub-task | Closed | Matyas Orhidi
      62. Remove .sec from OperatorConfigOptions and use Duration type config instead | Sub-task | Closed | Nicholas Jiang
      63. E2E tests should cover different watchNamespace scenarios | Sub-task | Closed | Márton Balassi
      64. Rethink the default reschedule reconcile loop | Sub-task | Closed | Unassigned
      65. Individual upgradeMode change does not trigger a reconciliation | Sub-task | Closed | Unassigned
      66. Link operator doc site to Flink Website | Sub-task | Closed | ZhengYu Chen
      67. The batch job does not work well with Operator | Sub-task | Closed | Unassigned
      68. Move jobs to suspended state before upgrading | Sub-task | Closed | Gyula Fora
      69. Output more status info for JobObserver | Sub-task | Closed | Biao Geng
      70. Add sanity check for state.savepoints.dir when using savepoint upgrade mode | Sub-task | Closed | Nicholas Jiang
      71. Align the helm chart version with the flink operator | Sub-task | Closed | Gyula Fora

        People

          Assignee: Thomas Weise
          Reporter: Thomas Weise
          Votes: 0
          Watchers: 21

            Time Tracking

              Estimated: Not Specified
              Remaining: 2h
              Logged: 2h