FLINK-17087

Use constant port for rest.port when it's set as 0 on Kubernetes

Details

    Description

      If rest.port is set to 0 when deploying a native Kubernetes session cluster, as in the following command,

      ./bin/kubernetes-session.sh -Dkubernetes.cluster-id=felix1 -Drest.port=0 ...
      

      the submission client throws the following exception:

       

      org.apache.flink.client.deployment.ClusterDeploymentException: Could not create Kubernetes cluster felix1
      at org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployClusterInternal(KubernetesClusterDescriptor.java:189)
      at org.apache.flink.kubernetes.KubernetesClusterDescriptor.deploySessionCluster(KubernetesClusterDescriptor.java:129)
      at org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:108)
      at org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:185)
      at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
      at org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:185)

      Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://xxx/apis/apps/v1/namespaces/default/deployments. Message: Deployment.apps "felix1" is invalid: spec.template.spec.containers[0].ports[0].containerPort: Required value. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.template.spec.containers[0].ports[0].containerPort, message=Required value, reason=FieldValueRequired, additionalProperties={})], group=apps, kind=Deployment, name=felix1, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Deployment.apps "felix1" is invalid: spec.template.spec.containers[0].ports[0].containerPort: Required value, metadata=ListMeta(_continue=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:510)
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:449)
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:413)
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372)
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:241)
      at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:798)
      at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:328)
      at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:324)
      at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.createJobManagerComponent(Fabric8FlinkKubeClient.java:83)
      at org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployClusterInternal(KubernetesClusterDescriptor.java:184)
      ... 5 more

       

      The exception message is unintuitive: it reports a missing containerPort value and never mentions rest.port, so users are unlikely to trace the failure back to their configuration.

      Therefore, this ticket proposes falling back to a fixed port when rest.port is set to 0, as is already done for blob.server.port and taskmanager.rpc.port.
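
      The proposed fallback could look roughly like the sketch below. It is only an illustration, not the actual Flink patch: the class RestPortFallback, the method resolveRestPort, and the use of a plain Map in place of Flink's Configuration are all hypothetical, and the fallback value 8081 is assumed because it is the documented default for rest.port.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of Flink. A plain Map stands in for Flink's Configuration.
public class RestPortFallback {

    // Assumed fallback value: 8081 is the documented default for rest.port.
    public static final int FALLBACK_REST_PORT = 8081;

    // Returns the port to use for the container spec. If rest.port is 0,
    // the config is rewritten to the fixed fallback so that the Deployment
    // sent to Kubernetes never carries containerPort 0.
    public static int resolveRestPort(Map<String, String> config) {
        String raw = config.getOrDefault("rest.port", String.valueOf(FALLBACK_REST_PORT));
        int port = Integer.parseInt(raw.trim());
        if (port == 0) {
            config.put("rest.port", String.valueOf(FALLBACK_REST_PORT));
            return FALLBACK_REST_PORT;
        }
        return port;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("rest.port", "0"); // mirrors -Drest.port=0 from the command above
        System.out.println("rest.port resolved to " + resolveRestPort(config));
    }
}
```

      Rewriting the value in the configuration, rather than only returning it, keeps the port seen by the client and the port written into the Deployment consistent. A real fix would also need to decide whether to log a warning when the user-supplied 0 is overridden.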

      People

            Assignee: Unassigned
            Reporter: Canbin Zheng (felixzheng)