Flink / FLINK-19677

TaskManager takes an abnormally long time to register with JobManager on Kubernetes


Details

    Description

      During the registration of a TaskManager, the JobManager creates a TaskManagerLocation instance, which tries to obtain the TaskManager's hostname via a reverse DNS lookup.

      However, this lookup always fails in a Kubernetes environment: the IPs of pods that are not exposed by Services cannot be resolved to domain names by CoreDNS, so InetAddress#getCanonicalHostName() takes ~5 seconds to return, blocking the whole registration process.
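
      To check whether a given environment is affected, the reverse lookup can be timed in isolation. The following is a minimal sketch, not Flink code; the class and method names are hypothetical. When run inside a cluster against the IP of a pod that is not exposed by a Service, the elapsed time would be expected to approach the ~5 seconds described above, although the exact value depends on the resolver configuration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration only; not part of Flink.
final class ReverseLookupTimer {

    // Returns how long InetAddress#getCanonicalHostName() takes, in milliseconds.
    static long timeCanonicalLookupMillis(InetAddress address) {
        long start = System.nanoTime();
        // Performs the reverse DNS lookup; falls back to the textual IP on failure.
        String name = address.getCanonicalHostName();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(address.getHostAddress() + " -> " + name
                + " (" + elapsedMillis + " ms)");
        return elapsedMillis;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Pass a pod IP as the first argument; defaults to loopback.
        // A literal IP string is parsed without any forward DNS query.
        InetAddress address = args.length > 0
                ? InetAddress.getByName(args[0])
                : InetAddress.getLoopbackAddress();
        timeCanonicalLookupMillis(address);
    }
}
```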

      Therefore, Flink should provide a configuration option to turn off the reverse DNS lookup. In addition, even when the hostname is actually needed, the lookup could be performed lazily so that it does not block the registration of other TaskManagers.
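
      The two proposals can be sketched together as follows. This is only an illustration of the idea under stated assumptions, not the change made in Flink: the class LazyHostName, the reverseLookupEnabled flag, and the injectable resolver are all hypothetical names. The hostname is resolved on first access rather than at construction time, and when the flag is off the textual IP address is returned without any DNS query.

```java
import java.net.InetAddress;
import java.util.function.Function;

// Hypothetical sketch; names and structure are assumptions, not Flink's API.
final class LazyHostName {

    private final InetAddress address;
    private final boolean reverseLookupEnabled;      // assumed to come from configuration
    private final Function<InetAddress, String> resolver;
    private volatile String hostName;                // cached after the first call to get()

    LazyHostName(InetAddress address, boolean reverseLookupEnabled) {
        this(address, reverseLookupEnabled, InetAddress::getCanonicalHostName);
    }

    // The resolver is injectable so the behaviour can be tested without real DNS.
    LazyHostName(InetAddress address,
                 boolean reverseLookupEnabled,
                 Function<InetAddress, String> resolver) {
        this.address = address;
        this.reverseLookupEnabled = reverseLookupEnabled;
        this.resolver = resolver;
    }

    // Resolves on first use. Concurrent first calls may each run the lookup once;
    // that is harmless here because the last written value is simply cached.
    String get() {
        String result = hostName;
        if (result == null) {
            result = reverseLookupEnabled
                    ? resolver.apply(address)        // potentially slow reverse DNS lookup
                    : address.getHostAddress();      // textual IP, no DNS query
            hostName = result;
        }
        return result;
    }
}
```

      With this shape, constructing the object during registration costs nothing; the slow path is paid only by the first caller that actually needs the hostname, and never when the lookup is disabled.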

          People

            kyledong Weike Dong