Samza / SAMZA-921

Consolidate LocalityManager and TaskAssignmentManager

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      As part of the work and discussion around SAMZA-906, there were a couple of observations about how locality information should be managed in Samza.

      1. There should be one locality manager that is ultimately responsible for mapping tasks to hosts. To do this, it may also need to manage separate intermediate mappings from task->container and container->host, though there are some contexts where both mappings will be needed (e.g. some implementations of BalancingTaskNameGrouper).
      2. Locality information should be written centrally. This facilitates a broadcast-like system, in which one leader writes to the coordinator stream once and that information is consumed by all non-leaders, either directly or through the leader. This also has the advantage that the leader can better track changes like a decrease in container count and clean up the container->host mapping, whereas the containers cannot do this naturally.
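
      The two observations above can be illustrated with a minimal sketch. It is not Samza's actual API: the class name UnifiedLocalitySketch and all of its methods are hypothetical, and plain in-memory maps stand in for whatever storage the real manager would use. It only shows one owner holding both intermediate mappings, deriving task->host from them, and cleaning up container->host entries when the container count drops.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/**
 * Hypothetical sketch (not Samza's API): a single manager owns both
 * intermediate mappings and derives task -> host from them.
 */
final class UnifiedLocalitySketch {
    // task name -> container id
    private final Map<String, String> taskToContainer = new HashMap<>();
    // container id -> host name
    private final Map<String, String> containerToHost = new HashMap<>();

    void assignTask(String taskName, String containerId) {
        taskToContainer.put(taskName, containerId);
    }

    void recordContainerHost(String containerId, String host) {
        containerToHost.put(containerId, host);
    }

    /** Composes task -> container -> host; empty if either link is missing. */
    Optional<String> hostForTask(String taskName) {
        return Optional.ofNullable(taskToContainer.get(taskName))
                .map(containerToHost::get);
    }

    /**
     * Leader-side cleanup, e.g. after the container count decreases:
     * drops container -> host entries for containers no longer in use.
     */
    void retainContainers(Set<String> liveContainerIds) {
        containerToHost.keySet().retainAll(liveContainerIds);
    }
}
```

      In this sketch, a caller that needs both intermediate mappings (as some BalancingTaskNameGrouper implementations reportedly do) would read them from the same object rather than from two separate managers.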

      Why doesn't it already work this way?
      1. The locality manager writes the locality in the container in order to ensure that the task was actually executing before writing the locality to the coordinator stream. This prevents a scenario where a container is attempted and fails on multiple hosts, thrashing the locality information and losing the local state. However, after SAMZA-871 is implemented, the leader will have the means to determine that the container is running, which will enable the locality to be written centrally.
      2. Since the container->host mapping is written in the containers, the LocalityManager registers itself as a CoordinatorStreamManager with a "source" specific to the container doing the writing. Conversely, the TaskAssignmentManager writes all mappings centrally, so it uses one "TaskAssignmentManager" source. If, however, the locality manager were updated to also write centrally, the two could be combined and register with one "LocalityManager" source, which would enable consolidation of the currently separate CoordinatorStreamManager implementations (LocalityManager and TaskAssignmentManager).

      So, after SAMZA-871 we should be able to combine the LocalityManager and TaskAssignmentManager into one class that writes everything centrally, but still preserves the guarantee that the container->host mapping is not written until the container is actually running. That work is the goal of this task.
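
      A rough sketch of what the combined class could look like, under stated assumptions: the names ConsolidatedLocalityWriterSketch and SketchMessage, the message type strings, and the launched/running/failed callbacks are all hypothetical, and an in-memory list stands in for the coordinator stream. How the leader actually learns that a container is running is left to SAMZA-871. The sketch only demonstrates the guarantee described above: everything is written from one place under one "LocalityManager" source, and a container->host entry is written only once the container is confirmed running.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for one message appended to the coordinator stream. */
final class SketchMessage {
    final String source;
    final String type;
    final String key;
    final String value;

    SketchMessage(String source, String type, String key, String value) {
        this.source = source;
        this.type = type;
        this.key = key;
        this.value = value;
    }
}

/** Hypothetical leader-side writer combining both managers' responsibilities. */
final class ConsolidatedLocalityWriterSketch {
    static final String SOURCE = "LocalityManager";

    // In-memory stand-in for the coordinator stream.
    private final List<SketchMessage> stream = new ArrayList<>();
    // Hosts of containers that were launched but are not yet confirmed running.
    private final Map<String, String> pendingHosts = new HashMap<>();

    /** Task assignments are written immediately, as they are centrally decided. */
    void writeTaskAssignment(String taskName, String containerId) {
        stream.add(new SketchMessage(SOURCE, "task-container", taskName, containerId));
    }

    /** Remember the attempted host, but do not write it yet. */
    void containerLaunched(String containerId, String host) {
        pendingHosts.put(containerId, host);
    }

    /** A failed attempt is discarded, so it never overwrites the last good locality. */
    void containerFailed(String containerId) {
        pendingHosts.remove(containerId);
    }

    /** Write container -> host only once the container is confirmed running. */
    boolean containerRunning(String containerId) {
        String host = pendingHosts.remove(containerId);
        if (host == null) {
            return false; // nothing pending for this container
        }
        stream.add(new SketchMessage(SOURCE, "container-host", containerId, host));
        return true;
    }

    List<SketchMessage> written() {
        return Collections.unmodifiableList(stream);
    }
}
```

      With this shape, a container that is attempted and fails on several hosts leaves no trace in the stream; only the host where it finally runs is recorded.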

            People

              Assignee: Unassigned
              Reporter: Jake Maes (jmakes)