Description
After HDFS-6268, the sort order of block locations is deterministic for a given block and locality level (local, rack, off-rack), so all off-rack clients see the same datanode first for the same block. This leads to very poor behavior in distributed cache localization and other scenarios where many clients want the same block data at approximately the same time. That one datanode is crushed by the load, while the other replicas handle only local and rack-local requests.
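One way to picture the problem and a possible mitigation is the sketch below. It is not the HDFS code or the fix applied for this issue. It assumes a hypothetical `Replica` type that carries a precomputed network distance, and a hypothetical helper `sortWithShuffledTiers` that keeps closer replicas first but shuffles replicas at the same distance, so that off-rack readers do not all pick the same datanode.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

final class ReplicaOrderSketch {
    /** Hypothetical stand-in for a datanode: a name plus its distance from the reader. */
    static final class Replica {
        final String name;
        final int distance;

        Replica(String name, int distance) {
            this.name = name;
            this.distance = distance;
        }
    }

    /** Sorts by distance, then shuffles each run of equal-distance replicas. */
    static List<Replica> sortWithShuffledTiers(List<Replica> replicas, Random rng) {
        List<Replica> sorted = new ArrayList<>(replicas);
        sorted.sort(Comparator.comparingInt((Replica r) -> r.distance));
        int start = 0;
        while (start < sorted.size()) {
            int end = start + 1;
            while (end < sorted.size()
                    && sorted.get(end).distance == sorted.get(start).distance) {
                end++;
            }
            // subList is a view, so shuffling it reorders only this distance tier.
            Collections.shuffle(sorted.subList(start, end), rng);
            start = end;
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Three replicas, all off-rack from the reader (same distance, assumed value 4).
        List<Replica> replicas = new ArrayList<>();
        replicas.add(new Replica("dn1", 4));
        replicas.add(new Replica("dn2", 4));
        replicas.add(new Replica("dn3", 4));

        // Count which replica each simulated off-rack client would read from first.
        Map<String, Integer> firstChoice = new LinkedHashMap<>();
        Random rng = new Random();
        for (int client = 0; client < 3000; client++) {
            String first = sortWithShuffledTiers(replicas, rng).get(0).name;
            firstChoice.merge(first, 1, Integer::sum);
        }
        System.out.println(firstChoice);
    }
}
```

With a purely deterministic order, every simulated client in `main` would choose the same first replica. With the per-tier shuffle, the first choice should spread roughly evenly across the three replicas, while replicas at a smaller distance still sort ahead of farther ones.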
Issue Links
- duplicates
  - HDFS-4253 block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance (Resolved)
- is broken by
  - HDFS-6268 Better sorting in NetworkTopology#pseudoSortByDistance when no local node is found (Closed)
- relates to
  - HADOOP-11107 Improve tests to not rely on JDK random implementation (Open)
  - HDFS-6701 Make seed optional in NetworkTopology#sortByDistance (Closed)