Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
2.4.0
-
None
-
Reviewed
Description
We saw incredibly excessive logs in application master for a long running one with many task attempts. The log write rate is around 1MB/sec in some cases.
Most of the log entries were from the progress report such as the following ones.
2015-02-03 17:46:14,321 INFO [IPC Server handler 56 on 37661] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1422985365246_0001_m_000000_0 is : 0.15605757
2015-02-03 17:46:17,581 INFO [IPC Server handler 2 on 37661] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1422985365246_0001_m_000000_0 is : 0.4108217
2015-02-03 17:46:20,426 INFO [IPC Server handler 0 on 37661] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1422985365246_0001_m_000002_0 is : 0.06634143
2015-02-03 17:46:20,807 INFO [IPC Server handler 4 on 37661] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1422985365246_0001_m_000000_0 is : 0.55556506
2015-02-03 17:46:21,013 INFO [IPC Server handler 6 on 37661] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1422985365246_0001_m_000001_0 is : 0.21723115
Looks like the report interval is controlled by a hard-coded variable PROGRESS_INTERVAL as 3 seconds in class org.apache.hadoop.mapred.Task. We should allow users to set the appropriate progress interval for their applications.
Attachments
Attachments
Issue Links
- incorporates
-
MAPREDUCE-6740 Enforce mapreduce.task.timeout to be at least mapreduce.task.progress-report.interval
- Resolved
- relates to
-
MAPREDUCE-5124 AM lacks flow control for task events
- Resolved