Type: New Feature
Affects Version/s: None
Fix Version/s: None
When applications upload files to or download files from HDFS clusters, it would be useful to throttle the IO so that it does not exceed a specified maximum bandwidth.
Two options to implement this IO throttling:
#1. IO Throttling happens at the FSDataInputStream and FSDataOutputStream level.
Add an IO throttler to FSDataInputStream/FSDataOutputStream, and whenever a read/write happens, throttle it first (if a throttler is set), then do the actual read/write.
We may need to add new FileSystem APIs that take an IO throttler as an input parameter.
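A minimal sketch of option #1, using a plain java.io wrapper rather than the real FSDataInputStream. The names IOThrottler and ThrottledInputStream are hypothetical and exist only for illustration. The wrapper charges the requested size before each read, which is one reading of "throttle it first"; charging the bytes actually returned is an equally plausible choice.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical interface mirroring the throttle(long) method proposed in this issue.
interface IOThrottler {
    void throttle(long numOfBytes);
}

// Hypothetical wrapper: throttles before each read, if a throttler is set.
class ThrottledInputStream extends FilterInputStream {
    private final IOThrottler throttler; // null means "no throttling"

    ThrottledInputStream(InputStream in, IOThrottler throttler) {
        super(in);
        this.throttler = throttler;
    }

    @Override
    public int read() throws IOException {
        if (throttler != null) {
            throttler.throttle(1); // a single-byte read is charged as one byte
        }
        return super.read();
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (throttler != null && len > 0) {
            throttler.throttle(len); // charge the requested size before reading
        }
        return super.read(buf, off, len);
    }
}
```

FilterInputStream.read(byte[]) delegates to read(byte[], int, int), so the bulk read path is covered by the second override. An output-side wrapper would follow the same pattern on top of FilterOutputStream.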
#2. IO Throttling happens at the application level.
Instead of changing the FSDataInputStream/FSDataOutputStream, all IO throttling is done at the application level.
In this approach, the FileSystem API remains unchanged.
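A sketch of what option #2 could look like in application code. The application owns the copy loop and calls the throttler itself, so the streams stay untouched. IOThrottler and ThrottledCopy are hypothetical names, and plain java.io streams stand in for the streams an application would get from FileSystem.open(Path) or FileSystem.create(Path).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical interface mirroring the throttle(long) method proposed in this issue.
interface IOThrottler {
    void throttle(long numOfBytes);
}

// Hypothetical application-level helper; the streams themselves know nothing about throttling.
final class ThrottledCopy {
    private ThrottledCopy() {}

    // Copies in to out, throttling each chunk before it is written. Returns bytes copied.
    static long copy(InputStream in, OutputStream out, IOThrottler throttler, int bufferSize)
            throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buf = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            if (throttler != null) {
                throttler.throttle(n); // charge exactly the bytes about to be written
            }
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }
}
```

The trade-off is that every application has to remember to throttle in its own loops, whereas option #1 enforces the limit for any caller of the stream.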
In either case, an IO throttler interface is needed, which has a method:
public void throttle(long numOfBytes);
The current DataTransferThrottler could be an implementation of this IO throttler interface.
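To make the interface concrete, here is a minimal sketch with one possible implementation. It is not the DataTransferThrottler algorithm: SimpleBandwidthThrottler is a hypothetical class that keeps the cumulative average rate under a limit by sleeping, and the interface name IOThrottler is likewise an assumption.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical interface with the single method proposed in this issue.
interface IOThrottler {
    void throttle(long numOfBytes);
}

// Hypothetical implementation: blocks callers so that the average rate since
// construction does not exceed bytesPerSecond.
final class SimpleBandwidthThrottler implements IOThrottler {
    private final long bytesPerSecond;
    private final long startNanos = System.nanoTime();
    private long totalBytes;

    SimpleBandwidthThrottler(long bytesPerSecond) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
    }

    // Pure helper: nanoseconds to wait so that totalBytes spread over the
    // elapsed time stays within the limit. Never negative.
    static long delayNanos(long totalBytes, long elapsedNanos, long bytesPerSecond) {
        long earliestNanos = (long) (totalBytes * 1e9 / bytesPerSecond);
        return Math.max(0L, earliestNanos - elapsedNanos);
    }

    // Synchronized on purpose: threads sharing one throttler share one budget,
    // so a sleeping caller also holds back the others.
    @Override
    public synchronized void throttle(long numOfBytes) {
        totalBytes += numOfBytes;
        long waitNanos = delayNanos(totalBytes, System.nanoTime() - startNanos, bytesPerSecond);
        if (waitNanos > 0) {
            try {
                TimeUnit.NANOSECONDS.sleep(waitNanos);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            }
        }
    }
}
```

Because this sketch averages over the whole lifetime of the throttler, a long idle period allows a burst afterwards. A period-based or token-bucket design would bound such bursts.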