[FLINK-3927] TaskManager registration may fail if Yarn versions don't match

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.1.0
    • Component/s: Runtime / Coordination
    • Labels: None

    Description

      Flink's ResourceManager uses the Yarn container IDs to identify connecting TaskManagers. The string form of a Yarn container ID may not be consistent across Hadoop versions, e.g. between Hadoop 2.3.0 and Hadoop 2.7.1. The ResourceManager obtains the ID from the Yarn container reports, while the TaskManager infers it from the Yarn environment variables. If the ResourceManager uses the Hadoop 2.3.0 client while the cluster runs Hadoop 2.7.1, the two sides can derive different strings for the same container, and the TaskManager's registration may fail.

      The solution is to pass the ID through a custom environment variable, which the ResourceManager sets before launching the TaskManager in the container. That way, both sides always use the ID string produced by the Hadoop client's ID generation method.
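
      The idea can be illustrated with a minimal, language-neutral sketch in Python. It is not Flink's actual code. The variable name `_FLINK_CONTAINER_ID`, the class and function names, and the ID string format are all assumptions made for illustration. The ResourceManager computes the ID string once with its own client logic, writes it into the launch environment, and the TaskManager reports that exact string back instead of inferring one from the cluster-provided environment.

```python
# Hypothetical variable name; the issue does not say what the variable is called.
ENV_FLINK_CONTAINER_ID = "_FLINK_CONTAINER_ID"


def client_id_string(cluster_ts, app_id, attempt_id, container_seq):
    """Stand-in for the Hadoop client's container ID stringification.

    The format is illustrative only, not taken from the issue.
    """
    return f"container_{cluster_ts}_{app_id:04d}_{attempt_id:02d}_{container_seq:06d}"


class ResourceManager:
    """Toy model of the side that launches containers and accepts registrations."""

    def __init__(self):
        self.launched = set()    # ID strings of containers this side started
        self.registered = set()  # ID strings of TaskManagers that connected

    def build_launch_env(self, container_fields, cluster_env):
        # Stringify the ID with the client's own method and hand it to the
        # TaskManager explicitly via the custom environment variable.
        cid = client_id_string(*container_fields)
        self.launched.add(cid)
        env = dict(cluster_env)
        env[ENV_FLINK_CONTAINER_ID] = cid
        return env

    def register_task_manager(self, resource_id):
        # Registration succeeds only if the reported ID matches a launched container.
        if resource_id not in self.launched:
            return False
        self.registered.add(resource_id)
        return True


def task_manager_resource_id(env):
    # The TaskManager reads the ID it was given rather than deriving one
    # from environment variables set by the cluster's Hadoop version.
    return env[ENV_FLINK_CONTAINER_ID]


if __name__ == "__main__":
    rm = ResourceManager()
    # The cluster may expose the same container under a differently formatted string.
    cluster_env = {"CONTAINER_ID": "container_e17_1410901177871_0001_01_000005"}
    env = rm.build_launch_env((1410901177871, 1, 1, 5), cluster_env)
    print(rm.register_task_manager(task_manager_resource_id(env)))
```

      In this sketch, registering with the cluster-provided `CONTAINER_ID` string would be rejected because it differs from the string the ResourceManager recorded, while the value passed through the custom variable matches by construction.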

            People

              Assignee: Maximilian Michels (mxm)
              Reporter: Maximilian Michels (mxm)