KAFKA-14281

Multi-level rack awareness

Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.4.0
    • Fix Version/s: None
    • Component/s: core

    Description

      Motivation

      With replication services, data can be replicated across independent Kafka clusters in multiple data centers. In addition, many customers need "stretch clusters": a single Kafka cluster that spans multiple data centers (DCs). This architecture has the following useful characteristics:

      • Data is natively replicated into all data centers by Kafka topic replication.
      • No data is lost when one DC is lost, and no configuration change is required, because the design relies implicitly on native Kafka replication.
      • From an operational point of view, such a topology is much easier to configure and operate than a replication scenario via MirrorMaker 2 (MM2).

      Kafka should provide "native" support for stretch clusters, covering the operational aspects that are specific to them.

      Multi-level rack awareness

      Stretch clusters are currently implemented using the rack awareness feature, where each DC is represented as a rack. This ensures that replicas are spread evenly across DCs. Unfortunately, this is too limiting when there are actual racks inside the DCs, because we cannot specify those racks as well. Consider 3 DCs with 2 racks each:

      /DC1/R1, /DC1/R2
      /DC2/R1, /DC2/R2
      /DC3/R1, /DC3/R2

      If we were to use DC1, DC2, DC3 as the rack IDs, we would lose the rack-level information of the setup. With RF=6, for example, the 2 replicas assigned to DC1 could both end up in the same physical rack.

      If we were to use /DC1/R1, /DC1/R2, etc. as the rack IDs, the six IDs would be treated as six unrelated racks. With RF=3, it is then possible that 2 replicas end up in the same DC, e.g. /DC1/R1, /DC1/R2, /DC2/R1.
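      The second failure mode can be checked with a small sketch. It is illustrative only: the helper names `parse_rack` and `replicas_per_unit` are hypothetical and not part of Kafka, and the path-like rack ID format is simply the one used in the example above.

```python
from collections import Counter


def parse_rack(rack_id):
    """Split a path-like rack ID such as '/DC1/R1' into ('DC1', 'R1')."""
    return tuple(part for part in rack_id.split("/") if part)


def replicas_per_unit(replica_racks, level):
    """Count replicas per unit at a given depth (1 = DC, 2 = DC plus rack)."""
    return Counter(parse_rack(rack)[:level] for rack in replica_racks)


# The RF=3 placement from the text: every replica sits in a distinct
# flat "rack", so a flat rack-aware assignment sees nothing wrong with it.
placement = ["/DC1/R1", "/DC1/R2", "/DC2/R1"]

per_rack = replicas_per_unit(placement, 2)
per_dc = replicas_per_unit(placement, 1)

print(per_rack)  # each (DC, rack) pair holds exactly one replica
print(per_dc)    # DC1 holds two replicas, DC3 holds none
```

      Counting at depth 2 shows a perfectly even spread, while counting at depth 1 exposes the imbalance between DCs. This is the information that a flat rack ID hides from the assignment logic.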

      Because of this, Kafka should support "multi-level" racks, meaning that rack IDs should be able to describe a hierarchy. With this feature, brokers should be able to:

      1. spread replicas evenly based on the top level of the hierarchy (i.e. first, between DCs)
      2. then, inside a top-level unit (DC) that holds multiple replicas, spread them evenly among the lower-level units (i.e. between racks, then between physical hosts, and so on)
      3. repeat this for all levels of the hierarchy
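      The steps above can be sketched as a recursive, level-by-level spread. This is a minimal illustration under stated assumptions, not the proposed Kafka implementation: the function names, the broker-ID-to-rack map, and the assumption that all rack IDs have the same depth are hypothetical. It also always starts from the first unit at each level, whereas a real assignor would presumably rotate the starting point per partition.

```python
from collections import defaultdict


def parse_rack(rack_id):
    """Split a path-like rack ID such as '/DC1/R1' into ('DC1', 'R1')."""
    return tuple(part for part in rack_id.split("/") if part)


def split_evenly(count, capacities):
    """Hand out `count` replicas round-robin, never exceeding a unit's capacity."""
    shares = [0] * len(capacities)
    remaining = count
    while remaining > 0:
        progressed = False
        for i, capacity in enumerate(capacities):
            if remaining > 0 and shares[i] < capacity:
                shares[i] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            raise ValueError("not enough brokers for the requested replica count")
    return shares


def assign_replicas(brokers, count, level=0):
    """Pick `count` broker IDs from {broker_id: rack_path}, spreading per level."""
    if count == 0:
        return []
    if not brokers:
        raise ValueError("no brokers available")
    depth = len(next(iter(brokers.values())))  # assumes uniform depth
    if level == depth:
        # Below the last level only individual brokers remain.
        return sorted(brokers)[:count]
    groups = defaultdict(dict)
    for broker_id, path in brokers.items():
        groups[path[level]][broker_id] = path
    names = sorted(groups)
    shares = split_evenly(count, [len(groups[name]) for name in names])
    chosen = []
    for name, share in zip(names, shares):
        chosen.extend(assign_replicas(groups[name], share, level + 1))
    return chosen


# Hypothetical cluster: 3 DCs x 2 racks x 2 brokers, with IDs 0..11.
RACK_IDS = ["/DC1/R1", "/DC1/R2", "/DC2/R1", "/DC2/R2", "/DC3/R1", "/DC3/R2"]
BROKERS = {
    rack_index * 2 + offset: parse_rack(rack_id)
    for rack_index, rack_id in enumerate(RACK_IDS)
    for offset in range(2)
}

print(assign_replicas(BROKERS, 3))  # [0, 4, 8]: one replica per DC
print(assign_replicas(BROKERS, 6))  # [0, 2, 4, 6, 8, 10]: one replica per rack
```

      With RF=3 each DC receives exactly one replica, and with RF=6 the two replicas in each DC land in different racks, which covers both problem cases described earlier.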

          People

            viktorsomogyi Viktor Somogyi-Vass