Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Description
BlockManagers running on executors handle all block management. Before a BlockManager can be used, it has to be initialized. As part of initialization, the BlockManager asks the BlockManagerMasterEndpoint for its topology information. The BlockManagerMasterEndpoint is given a pluggable interface that resolves a hostname to topology information, and the result is used to decorate the BlockManagerId. This happens at cluster start and whenever a new executor is added.
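The hostname-to-topology step can be sketched roughly as follows. This is only an illustration of the idea, not the code in the patch: `PeerId` is a simplified stand-in for BlockManagerId, and `TopologyResolver`, `NoTopology`, `StaticTopology` and `decorate` are hypothetical names chosen for this example.

```scala
// Hypothetical stand-in for BlockManagerId; the field names are assumptions.
final case class PeerId(
    executorId: String,
    host: String,
    port: Int,
    topologyInfo: Option[String] = None)

// Hypothetical pluggable interface: resolve a hostname to topology information.
trait TopologyResolver {
  def topologyForHost(host: String): Option[String]
}

// Assumed default: no topology is known, so ids stay undecorated.
object NoTopology extends TopologyResolver {
  def topologyForHost(host: String): Option[String] = None
}

// Example resolver backed by a static host -> rack map.
final class StaticTopology(racks: Map[String, String]) extends TopologyResolver {
  def topologyForHost(host: String): Option[String] = racks.get(host)
}

object TopologySketch {
  // What the master endpoint might do when a BlockManager registers:
  // look up the host and attach the result to the id.
  def decorate(id: PeerId, resolver: TopologyResolver): PeerId =
    id.copy(topologyInfo = resolver.topologyForHost(id.host))

  def main(args: Array[String]): Unit = {
    val resolver = new StaticTopology(Map("host-a" -> "/rack-1"))
    println(decorate(PeerId("1", "host-a", 7337), resolver))
    println(decorate(PeerId("2", "host-b", 7337), NoTopology))
  }
}
```

Keeping the resolver on the master side means executors do not each need their own copy of the topology mapping; they only receive the decorated id.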
During replication, the BlockManager gets the list of all its peers in the form of a Seq[BlockManagerId]. We add a pluggable prioritizer that can be used to prioritize this list of peers based on topology information. Peers with higher priority occur first in the sequence and the BlockManager tries to replicate blocks in that order.
Default implementations of these pluggable interfaces will reproduce the existing behavior of choosing a peer at random.
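A minimal sketch of what a pluggable prioritizer and its random default could look like is given here. All names (`PeerId`, `PeerPrioritizer`, `RandomPrioritizer`, `OffRackFirstPrioritizer`) are hypothetical, and the off-rack-first strategy is just one assumed example of a topology-aware ordering; the ticket does not prescribe a particular strategy.

```scala
import scala.util.Random

// Hypothetical stand-in for BlockManagerId, reduced to what the example needs.
final case class PeerId(executorId: String, host: String, topologyInfo: Option[String])

// Hypothetical pluggable interface: order peers from most to least preferred
// replication target for the given local block manager.
trait PeerPrioritizer {
  def prioritize(local: PeerId, peers: Seq[PeerId]): Seq[PeerId]
}

// Default that mirrors the existing behavior: peers in random order.
final class RandomPrioritizer(rng: Random = new Random()) extends PeerPrioritizer {
  def prioritize(local: PeerId, peers: Seq[PeerId]): Seq[PeerId] = rng.shuffle(peers)
}

// Illustrative topology-aware strategy (an assumption, not from the ticket):
// peers known to be in a different topology come first; peers in the same
// topology, or with unknown topology, keep their relative order after them.
object OffRackFirstPrioritizer extends PeerPrioritizer {
  def prioritize(local: PeerId, peers: Seq[PeerId]): Seq[PeerId] = {
    val (offRack, sameOrUnknown) = peers.partition { p =>
      p.topologyInfo.isDefined &&
      local.topologyInfo.isDefined &&
      p.topologyInfo != local.topologyInfo
    }
    offRack ++ sameOrUnknown
  }
}

object PrioritizerSketch {
  def main(args: Array[String]): Unit = {
    val local = PeerId("0", "host-a", Some("/rack-1"))
    val peers = Seq(
      PeerId("1", "host-b", Some("/rack-1")),
      PeerId("2", "host-c", Some("/rack-2")),
      PeerId("3", "host-d", None))
    println(OffRackFirstPrioritizer.prioritize(local, peers).map(_.executorId))
    println(new RandomPrioritizer().prioritize(local, peers).map(_.executorId))
  }
}
```

Because the BlockManager simply walks the returned sequence in order, swapping strategies changes only the ordering and leaves the replication loop itself untouched.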