Samza / SAMZA-1334

When host-affinity is turned off, ContainerAllocator should ignore any previous container locality info


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      Consider a job that has run with host affinity turned on, so that its container locality info has been written to the coordinator stream. The user then turns the host affinity feature off.

      That triggers a bug in ContainerAllocator:
      1) It gets the locality map from the JobModel, which still holds the list of preferred hosts read from the coordinator stream. Hence, ContainerAllocator makes preferred-host resource requests.
      2) Once it has finished launching all containers, ContainerAllocator tries to release the extra containers, which it looks up under ANY_HOST. However, all responses to preferred-host requests are kept under the specific host's entry. Hence, it fails to release those containers.

      The end result is that the job still launches successfully, but the YARN RM reports a lot of reserved memory/containers that the job has not released. In some extreme cases, the reserved memory/containers can be large enough to affect the availability of the whole cluster.
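
      The two steps above can be modelled with a small, self-contained Java sketch. It is not Samza's actual code: the class `AllocatorSketch`, its fields, and its method names are hypothetical, and the `ignoreStaleLocality` flag is only an assumption about what a fix could look like (fall back to ANY_HOST whenever host affinity is off).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the allocator's bookkeeping; not Samza's real classes.
class AllocatorSketch {
    static final String ANY_HOST = "ANY_HOST";

    // Granted-but-unused resources, keyed by the host they were requested on.
    final Map<String, List<String>> allocatedByHost = new HashMap<>();
    final boolean hostAffinityEnabled;
    // Assumed fix: when true, stale locality is ignored if host affinity is off.
    final boolean ignoreStaleLocality;

    AllocatorSketch(boolean hostAffinityEnabled, boolean ignoreStaleLocality) {
        this.hostAffinityEnabled = hostAffinityEnabled;
        this.ignoreStaleLocality = ignoreStaleLocality;
    }

    // Step 1: choose the host to request, given the locality map read from the job model.
    String requestHostFor(String containerId, Map<String, String> localityFromJobModel) {
        String preferred = localityFromJobModel.get(containerId);
        boolean usePreferred =
            preferred != null && (hostAffinityEnabled || !ignoreStaleLocality);
        return usePreferred ? preferred : ANY_HOST;
    }

    // Record a resource granted in response to a request made on `host`.
    void onResourceGranted(String host, String resourceId) {
        allocatedByHost.computeIfAbsent(host, h -> new ArrayList<>()).add(resourceId);
    }

    // Step 2: release extras as the report describes, draining only the ANY_HOST entry.
    List<String> releaseExtraResources() {
        List<String> released = allocatedByHost.remove(ANY_HOST);
        return released == null ? new ArrayList<>() : released;
    }

    // Resources still held after the release, i.e. what the RM would see as reserved.
    int heldResourceCount() {
        return allocatedByHost.values().stream().mapToInt(List::size).sum();
    }
}
```

      With host affinity off and a stale locality entry for a container, `new AllocatorSketch(false, false)` requests the preferred host, files the extra grant under that host, and still holds it after `releaseExtraResources()`. With `new AllocatorSketch(false, true)` the request goes to ANY_HOST, so the same release call frees it.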

            People

              Assignee: Unassigned
              Reporter: Yi Pan (nickpan47)
