  1. Spark
  2. SPARK-15352 Topology aware block replication
  3. SPARK-15353

Making peer selection for block replication pluggable

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: Block Manager, Spark Core
    • Labels: None

    Description

      BlockManagers running on executors handle all the logistics of block management. Before a BlockManager can be used, it must be initialized. As part of initialization, the BlockManager asks the BlockManagerMasterEndpoint for its topology information. The BlockManagerMasterEndpoint is given a pluggable interface that resolves a hostname to its topology, and the result is used to decorate the BlockManagerId. This happens at cluster start and whenever a new executor is added.
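      The hostname-to-topology step can be sketched as follows. This is a minimal, standalone illustration and not Spark's actual code: `PeerId`, `TopologyResolver`, `NoTopology`, `StaticTopologyResolver`, and `MasterSketch` are hypothetical names that stand in for the BlockManagerId, the pluggable interface, and the BlockManagerMasterEndpoint described above.

```scala
// Simplified stand-in for a block manager identifier (hypothetical, not Spark's class).
final case class PeerId(executorId: String, host: String, topology: Option[String] = None)

// Hypothetical pluggable interface: resolves a hostname to a topology string,
// for example a rack path. None means "no topology information available".
trait TopologyResolver {
  def topologyFor(host: String): Option[String]
}

// Assumed default: no topology information for any host.
object NoTopology extends TopologyResolver {
  def topologyFor(host: String): Option[String] = None
}

// Example implementation backed by a static host -> rack table.
final class StaticTopologyResolver(racks: Map[String, String]) extends TopologyResolver {
  def topologyFor(host: String): Option[String] = racks.get(host)
}

// Stand-in for the master endpoint: decorates an identifier with topology
// information when a block manager registers during initialization.
final class MasterSketch(resolver: TopologyResolver) {
  def register(id: PeerId): PeerId = id.copy(topology = resolver.topologyFor(id.host))
}
```

      Because the resolver is passed in as a constructor argument, swapping the topology source does not require changing the registration logic.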
      During replication, the BlockManager gets the list of all its peers as a Seq[BlockManagerId]. We add a pluggable prioritizer that orders this list of peers based on topology information. Higher-priority peers appear earlier in the sequence, and the BlockManager tries to replicate blocks to peers in that order.
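      A prioritizer of this kind might look like the sketch below. All names (`Peer`, `PeerPrioritizer`, `RandomPrioritizer`, `OtherRackFirstPrioritizer`, `ReplicationSketch`) are hypothetical and chosen for illustration, and the "other rack first" ordering is only one possible topology-aware policy, not necessarily the one the issue implements. The `upload` callback is an assumed placeholder for the actual block transfer.

```scala
import scala.util.Random

// Simplified peer identifier carrying optional topology info (hypothetical).
final case class Peer(executorId: String, host: String, rack: Option[String])

// Hypothetical pluggable prioritizer: returns the peers ordered from highest
// to lowest priority for the given local block manager.
trait PeerPrioritizer {
  def prioritize(local: Peer, peers: Seq[Peer]): Seq[Peer]
}

// Mirrors the "choose a peer at random" behavior by shuffling the peer list.
final class RandomPrioritizer(seed: Long) extends PeerPrioritizer {
  def prioritize(local: Peer, peers: Seq[Peer]): Seq[Peer] =
    new Random(seed).shuffle(peers)
}

// Example topology-aware policy: peers on a different rack (or with unknown
// rack) come first, peers on the local rack last. Relative order is preserved
// within each group.
object OtherRackFirstPrioritizer extends PeerPrioritizer {
  def prioritize(local: Peer, peers: Seq[Peer]): Seq[Peer] = {
    val (sameRack, otherRack) =
      peers.partition(p => p.rack.isDefined && p.rack == local.rack)
    otherRack ++ sameRack
  }
}

object ReplicationSketch {
  // Walks the prioritized list in order and stops attempting uploads once
  // `replicas` of them have succeeded. Returns the peers that hold a replica.
  def replicate(local: Peer, peers: Seq[Peer], replicas: Int, policy: PeerPrioritizer)(
      upload: Peer => Boolean): Seq[Peer] =
    policy.prioritize(local, peers).foldLeft(Vector.empty[Peer]) { (done, peer) =>
      if (done.size < replicas && upload(peer)) done :+ peer else done
    }
}
```

      Keeping the ordering policy separate from the replication loop means a failed upload simply falls through to the next peer in the prioritized sequence.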
      Both pluggable interfaces will have default implementations that preserve the existing behavior of choosing a peer at random.

      Attachments

        1. BlockManagerSequenceDiagram.png
          28 kB
          Shubham Chopra

          People

            Assignee: Shubham Chopra (shubhamc)
            Reporter: Shubham Chopra (shubhamc)