Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
2.7.0
-
None
-
Reviewed
-
Introduced two new configuration dfs.webhdfs.netty.low.watermark and dfs.webhdfs.netty.high.watermark to enable tuning the size of the buffers of the Netty server inside Datanodes.
Description
There is an issue that appears related to the webhdfs server. When making two concurrent requests, the DN will sometimes pause for extended periods (I've seen 1-300 seconds), killing performance and dropping connections.
To reproduce:
1. set up a HDFS cluster
2. Upload a large file (I was using 10GB). Perform 1-byte reads, writing
the time out to /tmp/times.txt
i=1 while (true); do echo $i let i++ /usr/bin/time -f %e -o /tmp/times.txt -a curl -s -L -o /dev/null "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root&length=1"; done
3. Watch for 1-byte requests that take more than one second:
tail -F /tmp/times.txt | grep -E "^[^0]"
4. After it has had a chance to warm up, start doing large transfers from
another shell:
i=1 while (true); do echo $i let i++ /usr/bin/time -f %e curl -s -L -o /dev/null "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root"; done
It's easy to find after a minute or two that small reads will sometimes
pause for 1-300 seconds. In some extreme cases, it appears that the
transfers timeout and the DN drops the connection.