Kafka / KAFKA-5578

Streams Task Assignor should consider the staleness of state directories when allocating tasks

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: streams
    • Labels: None

      Description

      During task assignment, we use the presence of a state directory to decide which instance should be given precedence for a task. We first choose instances that previously ran the task as active, and then fall back to instances that have a state directory for it. Unfortunately, we don't take into account how recent the data in the available state directories is. So when a task has run on many instances, we may choose an instance whose data is relatively old.

      When doing task assignment, we should take the age of the data in the state directories into consideration. We could use the data from the checkpoint files to determine which instance is most up to date and attempt to assign accordingly (while making sure that tasks remain balanced across the available instances).
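
      The proposal above could be sketched roughly as follows. This is a minimal, greedy illustration and not the Kafka Streams assignor: the class `StalenessAwareAssignor`, the method `assign`, and the input shape (task id mapped to the per-instance sum of checkpointed offsets) are all hypothetical. It assumes each instance has already read the offsets from its checkpoint files and reported them, and it treats an instance with no state directory for a task as having offset -1.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch only; names and input shape are assumptions, not Kafka Streams APIs.
final class StalenessAwareAssignor {

    /**
     * @param checkpointedOffsets task id -> (instance id -> sum of checkpointed offsets
     *                            reported from that instance's state directory)
     * @param instances           ids of all live instances
     * @return task id -> instance id chosen to run the task
     */
    static Map<String, String> assign(Map<String, Map<String, Long>> checkpointedOffsets,
                                      List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("no instances available");
        }
        // Balance constraint: no instance receives more than ceil(tasks / instances) tasks.
        int cap = (checkpointedOffsets.size() + instances.size() - 1) / instances.size();

        Map<String, Integer> load = new HashMap<>();
        for (String instance : instances) {
            load.put(instance, 0);
        }

        Map<String, String> assignment = new TreeMap<>();
        // Visit tasks in sorted order so the result is deterministic.
        for (Map.Entry<String, Map<String, Long>> task
                : new TreeMap<>(checkpointedOffsets).entrySet()) {
            String best = null;
            long bestOffset = Long.MIN_VALUE;
            int bestLoad = Integer.MAX_VALUE;
            for (String instance : instances) {
                int instanceLoad = load.get(instance);
                if (instanceLoad >= cap) {
                    continue; // instance is full; skip it to keep tasks balanced
                }
                // -1 marks "no state directory for this task on this instance".
                long offset = task.getValue().getOrDefault(instance, -1L);
                // Prefer the freshest state; break ties by the lighter load.
                if (best == null || offset > bestOffset
                        || (offset == bestOffset && instanceLoad < bestLoad)) {
                    best = instance;
                    bestOffset = offset;
                    bestLoad = instanceLoad;
                }
            }
            assignment.put(task.getKey(), best);
            load.merge(best, 1, Integer::sum);
        }
        return assignment;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Long>> offsets = new HashMap<>();
        offsets.put("0_0", Map.of("A", 100L, "B", 500L));
        offsets.put("0_1", Map.of("A", 50L, "B", 900L));
        System.out.println(assign(offsets, List.of("A", "B")));
    }
}
```

      In the `main` example, instance B holds the freshest state for both tasks, but the balance cap of one task per instance means B receives only `0_0` and `0_1` goes to A. A greedy pass like this depends on task order; a real assignor would likely need a smarter trade-off between freshness and balance.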

              People

              • Assignee: Unassigned
              • Reporter: Damian Guy (damianguy)
              • Votes: 0
              • Watchers: 4
