- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 3.3.0
- Fix Version/s: None
- Component/s: block placement, hdfs-client
- Labels: None
The default behavior of an HDFS write is to set up a pipeline. A file is broken into packets, and each packet is sent through the pipeline from one DataNode to the next. Pipelining provides good throughput, but latency suffers.
Allowing a client to specify a fan-out strategy would let it send each packet to all of the DataNodes concurrently, instead of passing the packet through the pipeline serially.
# Pipeline
C |-------> DN -------> DN -------> DN

# Fan Out
  |-------> DN
C |-------> DN
  |-------> DN
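To make the latency difference between the two layouts concrete, here is a toy model, not HDFS code. The class name, method names, and per-link latencies are all hypothetical, and the model deliberately ignores bandwidth: in the fan-out case the client's uplink carries every replica, which is the throughput cost that pipelining avoids.

```java
import java.util.Arrays;
import java.util.List;

/** Toy latency model for illustration only; all names here are hypothetical. */
final class WriteLatencyModel {
    private WriteLatencyModel() {}

    /** Pipeline: the packet crosses each hop in turn, so the hop latencies add up. */
    static long pipelineMillis(List<Long> hopMillis) {
        long total = 0;
        for (long hop : hopMillis) {
            total += hop;
        }
        return total;
    }

    /** Fan-out: the client sends to every DataNode at once, so the slowest link dominates. */
    static long fanOutMillis(List<Long> clientToDnMillis) {
        long slowest = 0;
        for (long link : clientToDnMillis) {
            slowest = Math.max(slowest, link);
        }
        return slowest;
    }

    public static void main(String[] args) {
        // Assumed one-way latencies in milliseconds, chosen only for the example.
        List<Long> links = Arrays.asList(2L, 3L, 4L);
        System.out.println("pipeline: " + pipelineMillis(links) + " ms"); // prints "pipeline: 9 ms"
        System.out.println("fan-out:  " + fanOutMillis(links) + " ms");   // prints "fan-out:  4 ms"
    }
}
```

Under these assumed numbers the last replica receives the packet after 9 ms in the pipeline and after 4 ms with fan-out.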
Also, if there is a 'min replication' of, for example, 2, the client only needs to wait for the first 2 ACKs before writing the next packet, as long as those 2 ACKs come from different racks. The block placement rules may need to support this.
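The ACK rule above could be checked along the following lines. This is a minimal sketch, not an existing HDFS API: `AckQuorum`, `canSendNextPacket`, and the `minRacks` parameter are hypothetical names, and racks are represented as plain strings purely for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of the HDFS client. */
final class AckQuorum {
    private AckQuorum() {}

    /**
     * Decides whether the client may write the next packet.
     *
     * @param ackedRacks     rack of each DataNode that has ACKed the current packet so far
     * @param minReplication number of ACKs that must have arrived
     * @param minRacks       number of distinct racks those ACKs must cover (assumed parameter)
     */
    static boolean canSendNextPacket(List<String> ackedRacks, int minReplication, int minRacks) {
        // Collapse duplicates so that two ACKs from the same rack count as one rack.
        Set<String> distinctRacks = new HashSet<>(ackedRacks);
        return ackedRacks.size() >= minReplication && distinctRacks.size() >= minRacks;
    }
}
```

With a min replication of 2 and 2 required racks, ACKs from `/rack1` and `/rack2` would let the client proceed, while two ACKs from `/rack1` alone would not; the client would keep waiting for an ACK from a DataNode on another rack.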
HBase requires this improved latency.