Flink / FLINK-11632

Make TaskManager automatic bind address picking more explicit (by default) and more configurable

Details

    Description

      Currently, flink-conf.yaml has an optional taskmanager.host configuration option that lets Flink users statically pre-define the bind address the TaskManager listens on (note: this option can also be overridden by passing the corresponding command-line option to Flink).

      When the option is not set, the TaskManager tries to pick a bind address heuristically.

      The resulting address (hostname) is used to advertise the various service endpoints running in the TM to the JobManager. It is also resolved later to an InetAddress that is used as the bind address for the TM's inter-node communication.

      The proposal is to minimize the use of heuristics by default by introducing a new configuration option (for example, taskmanager.host.bind-policy) with the following possible values:

      • "hostname" - default, use the TM host's name (== InetAddress.getLocalHost().getHostName());
      • "ip" - use the TM host's IP address (== InetAddress.getLocalHost().getHostAddress());
      • "auto-detect-hostname" - use the heuristics based detection mechanism.
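
      The three values above could be mapped to an address roughly as in the following minimal sketch. The names BindPolicySketch, BindPolicy, fromConfig and resolve are hypothetical and chosen only for illustration; they are not existing Flink API. The heuristic branch is deliberately left unimplemented here.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

// Illustrative sketch only: class and method names are assumptions, not Flink API.
final class BindPolicySketch {

    enum BindPolicy {
        HOSTNAME("hostname"),
        IP("ip"),
        AUTO_DETECT_HOSTNAME("auto-detect-hostname");

        private final String configValue;

        BindPolicy(String configValue) {
            this.configValue = configValue;
        }

        // Parses the configured value; an unset or blank value falls back to the
        // proposed default ("hostname").
        static BindPolicy fromConfig(String value) {
            if (value == null || value.trim().isEmpty()) {
                return HOSTNAME;
            }
            String normalized = value.trim().toLowerCase(Locale.ROOT);
            for (BindPolicy policy : values()) {
                if (policy.configValue.equals(normalized)) {
                    return policy;
                }
            }
            throw new IllegalArgumentException("Unknown bind policy: " + value);
        }
    }

    // Resolves the address string the TM would advertise under the given policy.
    static String resolve(BindPolicy policy) throws UnknownHostException {
        switch (policy) {
            case HOSTNAME:
                return InetAddress.getLocalHost().getHostName();
            case IP:
                return InetAddress.getLocalHost().getHostAddress();
            case AUTO_DETECT_HOSTNAME:
                // The existing heuristic detection would be invoked here; omitted in this sketch.
                throw new UnsupportedOperationException("heuristic detection not sketched");
            default:
                throw new IllegalStateException("Unhandled policy: " + policy);
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        BindPolicy policy = BindPolicy.fromConfig(args.length > 0 ? args[0] : null);
        System.out.println(policy + " -> " + resolve(policy));
    }
}
```

      With this shape, both explicit policies are a single deterministic lookup, and only "auto-detect-hostname" depends on network state at startup.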

      Note: the configuration key and values could be named better; proposals are welcome.
      Note 2: in the future, the configuration option may need to be extended to allow choosing a specific network interface, or a preference for IPv6 vs IPv4.

      Rationale

      The heuristic mechanism tries to establish a probe connection to jobmanager.rpc.address from the addresses of different network interfaces.
      In parallel setups (when the JM and multiple TMs start simultaneously), the outcome depends on timing and on the assigned network IP addresses, and may end up with "non-uniform" address bindings across TMs: some may be "lucky" and pick up a non-default network interface, while others fall back to InetAddress.getLocalHost().getHostName(). In the end, it is less obvious and less transparent which bind address a TM picks up.
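
      To make the timing dependence concrete, here is a simplified sketch of a probe-connection heuristic. It is not Flink's actual implementation; ProbeConnectSketch, canConnectFrom and firstConnectingAddress are hypothetical names, and the timeout is an assumed parameter. If the JM is not yet listening when a TM probes, every candidate fails and the caller has to fall back, which is how TMs started in parallel can end up with different bindings.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.Socket;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;
import java.util.Optional;

// Simplified illustration of a probe-connection heuristic; names are assumptions.
final class ProbeConnectSketch {

    // Returns true if a TCP connection to 'target' succeeds when the socket is
    // bound to the given local address (ephemeral local port).
    static boolean canConnectFrom(InetAddress local, InetSocketAddress target, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.bind(new InetSocketAddress(local, 0));
            socket.connect(target, timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, unreachable, or not bindable: candidate rejected.
            return false;
        }
    }

    // Tries every address of every local interface and returns the first one
    // from which the target is reachable, or empty if none works (the caller
    // would then fall back to InetAddress.getLocalHost()).
    static Optional<InetAddress> firstConnectingAddress(InetSocketAddress target, int timeoutMillis)
            throws SocketException {
        Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
        if (interfaces == null) {
            return Optional.empty();
        }
        for (NetworkInterface nif : Collections.list(interfaces)) {
            for (InetAddress candidate : Collections.list(nif.getInetAddresses())) {
                if (canConnectFrom(candidate, target, timeoutMillis)) {
                    return Optional.of(candidate);
                }
            }
        }
        return Optional.empty();
    }
}
```

      The result depends on whether the target is reachable at the moment of probing and on the order in which interfaces are enumerated, neither of which is under the user's control.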

      In practice, it is likely that in the majority of cases (in well set up environments) the heuristic mechanism returns a result that matches InetAddress.getLocalHost(). The proposal is to use this simpler and more explicit binding by default, avoiding the non-determinism of the heuristics.

      The old mechanism remains available in case it is useful in some setups, but it requires an explicit configuration setting.

      Additionally, this proposal extends the "auto configuration" option by allowing users to choose the host's IP address instead of its hostname. This may be convenient in situations where the TMs' machines are not necessarily reachable via DNS (for example, in a Kubernetes setup).

            People

              1u0 Alex

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 40m