Description
The Fetcher class defines a "hard" timeout as 50% of the MapReduce task timeout (see mapreduce.task.timeout and fetcher.threads.timeout.divisor). If fetcher threads are running but make no progress during the timeout period (measured by newly started fetch items), Fetcher shuts down so that the task timeout is not reached and the fetcher job does not fail. The "hung threads" are logged together with the URL being fetched and, at DEBUG level, the Java stack.
In addition to logging, a job counter should indicate the number of hung threads. This would make it possible to see at the job level whether there are issues with hung threads. Tracing the issues would still require looking into the Hadoop task logs.
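
The proposal could be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Fetcher code: the class name, method names, and the counter name "hungThreads" are assumptions made for this example, and a plain Map stands in for the Hadoop job counters so that the sketch runs without Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names are taken from the Nutch code base.
class HungThreadCounterSketch {

    /** Hard timeout: the task timeout divided by the configured divisor (2 gives 50%). */
    public static long hardTimeoutMillis(long taskTimeoutMillis, int divisor) {
        return taskTimeoutMillis / divisor;
    }

    /**
     * Threads count as hung if some are still active but no new fetch item
     * was started within the hard timeout.
     */
    public static boolean isHung(int activeThreads, long lastFetchStartMillis,
                                 long nowMillis, long timeoutMillis) {
        return activeThreads > 0 && (nowMillis - lastFetchStartMillis) > timeoutMillis;
    }

    /**
     * Adds the number of hung threads to a counter. In a real MapReduce task
     * this would presumably be a call such as
     * context.getCounter(group, name).increment(n); the Map is a stand-in.
     */
    public static void recordHungThreads(Map<String, Long> counters, int hungThreads) {
        counters.merge("hungThreads", (long) hungThreads, Long::sum);
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new HashMap<>();
        // Assumed values: 10 minute task timeout, divisor 2.
        long timeout = hardTimeoutMillis(600_000L, 2);
        int activeThreads = 3;
        long lastFetchStart = 0L;
        long now = 300_001L;
        if (isHung(activeThreads, lastFetchStart, now, timeout)) {
            recordHungThreads(counters, activeThreads);
        }
        System.out.println("hungThreads=" + counters.getOrDefault("hungThreads", 0L));
    }
}
```

With a counter like this, a non-zero value in the job summary would signal that hung threads occurred, while the URLs and stack traces would still have to be read from the task logs.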