Hey guys, I think the metric you introduced is absolutely deceiving, and has nothing to do with the throughput the benchmark is intended to measure.
"Test exec time" is the running time of the job, which includes the compute overhead: scheduling, cleanup, and retries if there were failed maps.
What we want to benchmark, however, is the average throughput of the actual data transfers on HDFS. If you look at the implementation, you will see that it measures the time of the transfers only.
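To illustrate the difference, here is a minimal sketch in Python, not the benchmark's actual code. The function names and all numbers are made up for illustration, and it assumes the transfer-based figure divides total data by the summed per-task transfer time. It shows that a figure based on job running time moves with scheduling or retry overhead, while the transfer-based figure does not.

```python
MB = 1024 * 1024


def transfer_throughput_mb_s(task_bytes, task_io_seconds):
    """Total MB moved divided by the summed per-task transfer time (assumed definition)."""
    return (sum(task_bytes) / MB) / sum(task_io_seconds)


def job_time_throughput_mb_s(task_bytes, job_exec_seconds):
    """Total MB moved divided by the wall-clock running time of the whole job."""
    return (sum(task_bytes) / MB) / job_exec_seconds


# Hypothetical run: 4 map tasks, each writing 1 GiB and spending 10 s in the transfer itself.
task_bytes = [1024 * MB] * 4
task_io_seconds = [10.0, 10.0, 10.0, 10.0]

# Same transfers, two hypothetical job running times:
# one clean run, and one where scheduling delays and a retried map add 15 s.
clean_job_seconds = 25.0
slow_job_seconds = 40.0

transfer_rate = transfer_throughput_mb_s(task_bytes, task_io_seconds)
job_rate_clean = job_time_throughput_mb_s(task_bytes, clean_job_seconds)
job_rate_slow = job_time_throughput_mb_s(task_bytes, slow_job_seconds)

print(f"transfer-based:         {transfer_rate:.2f} MB/s")
print(f"job-time-based (clean): {job_rate_clean:.2f} MB/s")
print(f"job-time-based (slow):  {job_rate_slow:.2f} MB/s")
```

In this made-up example the transfers are identical in both runs, yet the job-time-based number changes from one run to the other, because it reflects the compute framework's overhead rather than HDFS.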
The formatting changes are fine. But I think "Total Throughput" should be removed.
The bug reported in MAPREDUCE-6931 makes it invalid, but even if fixed it is still deceiving.
Also, DFSIO issues should be filed in the HDFS jira. Then you can expect a more prompt response.
Sorry, the last part was for the other jira. Please ignore.