Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
The motivation is well described in HADOOP-11252.
The RPC client uses a default timeout of 0 when no timeout is passed in. This means that the network connection it creates will never time out when used to write data. The issue has shown up in YARN-2578 and HDFS-4858. Write timeouts then fall back to TCP-level retries (configured via tcp_retries2), which take roughly 15 to 30 minutes to give up. That is too long for a default behaviour.
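To show where a figure in the 15 to 30 minute range comes from, here is a rough sketch of the retransmission arithmetic. It assumes the usual Linux values (tcp_retries2 = 15, a minimum retransmission timeout of 200 ms, a cap of 120 s, and doubling backoff); the class and method names are made up for illustration, and real connections start from an RTT-derived timeout, so the actual wait is typically longer than this lower bound.

```java
public class TcpRetries2Estimate {
    // Assumed Linux defaults; not taken from the issue text.
    static final long RTO_MIN_MS = 200;      // minimum retransmission timeout
    static final long RTO_MAX_MS = 120_000;  // cap on the retransmission timeout

    /**
     * Sums the waits for the original transmission plus `retries`
     * retransmissions, doubling the timeout each time up to the cap.
     */
    static long estimateTimeoutMs(int retries, long initialRtoMs) {
        long total = 0;
        long rto = initialRtoMs;
        for (int i = 0; i <= retries; i++) {
            total += rto;
            rto = Math.min(rto * 2, RTO_MAX_MS);
        }
        return total;
    }

    public static void main(String[] args) {
        long ms = estimateTimeoutMs(15, RTO_MIN_MS);
        System.out.println("Estimated give-up time: " + ms + " ms");
    }
}
```

Under these assumptions the estimate is 924,600 ms, a little over 15 minutes, which matches the low end of the range quoted above.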
However, HADOOP-11252 did not set the default to a meaningful timeout (it is zero, which means infinite), so users will still hit this issue by default. Maybe we should set the default to a meaningful value.
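A minimal sketch of the zero-means-infinite semantics follows. It uses java.util.Properties as a stand-in for Hadoop's Configuration so it runs without Hadoop on the classpath. The key name `ipc.client.rpc-timeout.ms` is an assumption that should be checked against your Hadoop version, the class and helper names are hypothetical, and the 60000 ms value is only a placeholder, not a proposed default.

```java
import java.util.Properties;

public class RpcTimeoutDefault {
    // Assumed key name; verify against the Hadoop release in use.
    static final String RPC_TIMEOUT_KEY = "ipc.client.rpc-timeout.ms";
    // Per the description above, the current default is zero.
    static final int CURRENT_DEFAULT_MS = 0;

    /** Returns the configured timeout, or defaultMs when the key is unset. */
    static int configuredTimeoutMs(Properties conf, int defaultMs) {
        String raw = conf.getProperty(RPC_TIMEOUT_KEY);
        return raw == null ? defaultMs : Integer.parseInt(raw.trim());
    }

    /** A timeout of zero is treated as "wait forever". */
    static boolean waitsForever(int timeoutMs) {
        return timeoutMs == 0;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        int byDefault = configuredTimeoutMs(conf, CURRENT_DEFAULT_MS);
        System.out.println("default waits forever: " + waitsForever(byDefault));

        // Placeholder value chosen only for illustration.
        conf.setProperty(RPC_TIMEOUT_KEY, "60000");
        int explicit = configuredTimeoutMs(conf, CURRENT_DEFAULT_MS);
        System.out.println("explicit waits forever: " + waitsForever(explicit));
    }
}
```

Until the default changes, a user who does not set the value explicitly ends up on the first path, which is the situation this issue describes.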