Samza / SAMZA-1334

When host-affinity is turned off, ContainerAllocator should ignore any previous container locality info

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Consider a job for which host affinity was turned on at some point, so that locality info was written to the coordinator stream. The user may later turn the host affinity feature off.

      That triggers a bug in ContainerAllocator:
      1) It gets the locality map from JobModel, which still holds the list of preferred hosts read from the coordinator stream. As a result, ContainerAllocator keeps making preferred-host resource requests even though host affinity is off.
      2) After launching all containers, ContainerAllocator tries to release the extra containers, which it looks up under ANY_HOST. However, the responses to preferred-host requests are kept under each specific host's entry, so it fails to release those containers.
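
      The two steps above can be illustrated with a small, self-contained sketch. This is not Samza's actual code: the class HostAffinitySketch, its methods, and the "spare-for-N" resource names are hypothetical, and the buffer is a plain map standing in for the allocator's per-host resource bookkeeping. It only models the mismatch described here (requests keyed by stored host, cleanup keyed by ANY_HOST) and one plausible remedy, which is to ignore stored locality whenever host affinity is disabled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the bug; none of these names come from Samza itself.
class HostAffinitySketch {
    static final String ANY_HOST = "ANY_HOST";

    /** Host to request for a container: stored locality is honoured only when host affinity is on. */
    static String requestHost(boolean hostAffinityEnabled, String storedHost) {
        return (hostAffinityEnabled && storedHost != null) ? storedHost : ANY_HOST;
    }

    /**
     * Buffers one spare (unused) resource per container under the host key it was requested with.
     * When checkFlag is false, the stored host is used regardless of the host-affinity setting,
     * which mimics the behaviour described in the report.
     */
    static Map<String, List<String>> bufferSpares(Map<String, String> locality,
                                                  boolean checkFlag,
                                                  boolean hostAffinityEnabled) {
        Map<String, List<String>> buffer = new HashMap<>();
        for (Map.Entry<String, String> e : locality.entrySet()) {
            String host = checkFlag ? requestHost(hostAffinityEnabled, e.getValue()) : e.getValue();
            buffer.computeIfAbsent(host, k -> new ArrayList<>()).add("spare-for-" + e.getKey());
        }
        return buffer;
    }

    /** Mimics the end-of-launch cleanup: only resources buffered under ANY_HOST are released. */
    static List<String> releaseExtras(Map<String, List<String>> buffer) {
        List<String> released = buffer.remove(ANY_HOST);
        return released == null ? new ArrayList<>() : released;
    }

    public static void main(String[] args) {
        // Locality left over in the coordinator stream from an earlier run with host affinity on.
        Map<String, String> locality = new HashMap<>();
        locality.put("0", "host-a");
        locality.put("1", "host-b");

        // Host affinity is now off, but stored locality is still used for requests.
        Map<String, List<String>> buggy = bufferSpares(locality, false, false);
        int buggyReleased = releaseExtras(buggy).size();
        System.out.println("buggy: released=" + buggyReleased + " hostsStillHoldingSpares=" + buggy.size());

        // Host affinity is off and stored locality is ignored.
        Map<String, List<String>> fixed = bufferSpares(locality, true, false);
        int fixedReleased = releaseExtras(fixed).size();
        System.out.println("fixed: released=" + fixedReleased + " hostsStillHoldingSpares=" + fixed.size());
    }
}
```

      In the first case nothing is buffered under ANY_HOST, so the cleanup releases nothing and both spares stay held. In the second case every request falls back to ANY_HOST, so the same cleanup releases everything.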

      The end result is that the job still launches successfully, but the YARN RM reports a lot of reserved memory and containers that the job has not released. In extreme cases, the reserved memory and containers can be large enough to affect the availability of the whole cluster.

              People

              • Assignee: Unassigned
              • Reporter: Yi Pan (nickpan47)
              • Votes: 0
              • Watchers: 3