Affects Version/s: 2.0.0-alpha
Fix Version/s: 2.0.2-alpha
Currently, HDFS exposes on which datanodes a block resides, which allows clients to make scheduling decisions for locality and load balancing. Extending this to also expose on which disk on a datanode a block resides would enable even better scheduling, on a per-disk rather than coarse per-datanode basis.
This API would likely look similar to Filesystem#getFileBlockLocations, but also involve a series of RPCs to the responsible datanodes to determine disk ids.
|Status||Open [ 1 ]||Patch Available [ 10002 ]|
|Status||Patch Available [ 10002 ]||Resolved [ 5 ]|
|Hadoop Flags||Reviewed [ 10343 ]|
|Fix Version/s||2.2.0-alpha [ 12322472 ]|
|Resolution||Fixed [ 1 ]|
|Status||Resolved [ 5 ]||Closed [ 6 ]|