TaskAttemptListenerImpl.statusUpdate() bloats the log files. On every call, the listener uses LOG.info() to print the progress of the TaskAttempt.
After discussing this issue with Nathan Roberts, Eric Badger, and Eric Payne, we agreed that while a log of task progress is helpful, logging the progress on every update is excessive.
This Jira is to suppress the excessive logging from TaskAttemptListener without affecting the frequency of progress updates.
There are two flags:
- -Dmapreduce.task.log.progress.delta.threshold=0.10: the task progress is logged each time it advances by 10%. Default is 5%.
- -Dmapreduce.task.log.progress.wait.interval-seconds=120: the listener logs the progress every 2 minutes. This is helpful for long-running tasks that take a long time to reach the delta threshold. Default is 1 minute.
The listener will log the progress as soon as either delta.threshold or wait.interval-seconds is reached, whichever comes first.
Enabling LOG.DEBUG for TaskAttemptListenerImpl will override those two flags and log the task progress on every update.
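The decision described above can be sketched roughly as follows. This is only an illustration of the intended behavior, not the actual patch: the class ProgressLogThrottle and its method shouldLog are hypothetical names, and the sketch assumes that a DEBUG-triggered log also resets the delta and interval baselines.

```java
/**
 * Hypothetical helper illustrating the throttling rule; it is not part of
 * TaskAttemptListenerImpl and the names are assumptions.
 */
final class ProgressLogThrottle {
    private final double deltaThreshold;    // e.g. 0.05 for the 5% default
    private final long waitIntervalMillis;  // e.g. 60 seconds by default
    private double lastLoggedProgress;
    private long lastLoggedTimeMillis;

    ProgressLogThrottle(double deltaThreshold, long waitIntervalSeconds, long startMillis) {
        this.deltaThreshold = deltaThreshold;
        this.waitIntervalMillis = waitIntervalSeconds * 1000L;
        this.lastLoggedProgress = 0.0;
        this.lastLoggedTimeMillis = startMillis;
    }

    /**
     * Returns true when this status update should be logged: DEBUG is enabled,
     * or the progress delta or the wait interval has been reached since the
     * last logged update, whichever happens first.
     */
    boolean shouldLog(double progress, long nowMillis, boolean debugEnabled) {
        boolean deltaReached = progress - lastLoggedProgress >= deltaThreshold;
        boolean intervalReached = nowMillis - lastLoggedTimeMillis >= waitIntervalMillis;
        if (debugEnabled || deltaReached || intervalReached) {
            // Remember what was logged so the next decision is relative to it.
            lastLoggedProgress = progress;
            lastLoggedTimeMillis = nowMillis;
            return true;
        }
        return false;
    }
}
```

With the defaults (0.05 and 60 seconds), an update at 2% progress after 10 seconds would be skipped, while an update at 6% progress, or any update arriving 60 seconds after the last logged one, would be logged. Status updates themselves are still processed on every call; only the log statement is gated.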