To minimize the number of map fetch failures reported by reducers across an NM restart, it would be nice if reducers only reported a fetch failure after trying for a specified period of time to retrieve the data.
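The requested behavior could look roughly like the sketch below: keep retrying the fetch and only surface the failure once a retry window has elapsed since the first failed attempt. This is an illustration only, not the actual Fetcher code; the class `FetchRetry`, the method `fetchWithRetry`, and the parameters `retryTimeoutMs` and `retryIntervalMs` are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not the actual Hadoop Fetcher code. */
public class FetchRetry {

    /**
     * Calls fetch repeatedly until it succeeds, or until retryTimeoutMs has
     * elapsed since the first failed attempt. Only then is the failure
     * propagated to the caller (i.e. reported as a fetch failure).
     */
    public static <T> T fetchWithRetry(Callable<T> fetch,
                                       long retryTimeoutMs,
                                       long retryIntervalMs) throws Exception {
        long firstFailureNanos = -1; // -1 means no failure has been seen yet
        while (true) {
            try {
                return fetch.call();
            } catch (IOException e) {
                long now = System.nanoTime();
                if (firstFailureNanos < 0) {
                    firstFailureNanos = now; // start the retry window
                }
                long elapsedMs = (now - firstFailureNanos) / 1_000_000L;
                if (elapsedMs >= retryTimeoutMs) {
                    throw e; // retry window exhausted: report the failure
                }
                Thread.sleep(retryIntervalMs); // wait before the next attempt
            }
        }
    }
}
```

With a retry window longer than the expected NM restart time, a reducer using this pattern would ride out the restart instead of reporting a failure on the first refused connection. Setting `retryTimeoutMs` to 0 makes the first failure propagate immediately, with no retries.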
- breaks
  - MAPREDUCE-6303 Read timeout when retrying a fetch error can be fatal to a reducer (Closed)
- is related to
  - MAPREDUCE-6156 Fetcher - connect() doesn't handle connection refused correctly (Closed)
  - YARN-666 [Umbrella] Support rolling upgrades in YARN (Closed)
- relates to
  - YARN-1336 [Umbrella] Work-preserving nodemanager restart (Resolved)