Details
Description
The return value of job.waitForCompletion() in the new MapReduce API (NUTCH-2375) must always be checked. If it is not true, the job has failed or been killed. In that case, the program
- should not proceed with further jobs/steps
- must clean up temporary data, unlock the CrawlDB, etc.
- must exit with a non-zero exit value, so that scripts running the crawl workflow can handle the failure
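
The three requirements above could be sketched roughly as follows. This is only an illustration, not the actual Nutch patch: CompletableJob is a hypothetical stand-in that mirrors the signature of org.apache.hadoop.mapreduce.Job.waitForCompletion(boolean) so that the example compiles without Hadoop, and the cleanup callback is an assumed placeholder for removing temporary data and unlocking the CrawlDB.

```java
import java.io.IOException;
import java.util.List;

/**
 * Hypothetical stand-in for org.apache.hadoop.mapreduce.Job (assumption:
 * only the waitForCompletion(boolean) method matters for this sketch).
 */
interface CompletableJob {
  boolean waitForCompletion(boolean verbose)
      throws IOException, InterruptedException, ClassNotFoundException;
}

final class JobCompletionSketch {

  private JobCompletionSketch() {}

  /**
   * Runs one job and returns an exit code: 0 on success, 1 otherwise.
   * The cleanup callback is a placeholder (assumption) for deleting
   * temporary data, unlocking the CrawlDB, etc. It runs only on failure.
   */
  static int runJob(CompletableJob job, Runnable cleanup) {
    boolean success = false;
    try {
      // The return value must be checked: false means failed or killed.
      success = job.waitForCompletion(true);
    } catch (IOException | ClassNotFoundException e) {
      System.err.println("Job failed: " + e);
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can still observe it.
      Thread.currentThread().interrupt();
      System.err.println("Job interrupted: " + e);
    }
    if (!success) {
      cleanup.run();
      return 1;
    }
    return 0;
  }

  /** Runs the jobs in order and stops at the first one that does not succeed. */
  static int runAll(List<CompletableJob> jobs, Runnable cleanup) {
    for (CompletableJob job : jobs) {
      int exitCode = runJob(job, cleanup);
      if (exitCode != 0) {
        return exitCode; // do not proceed with further jobs/steps
      }
    }
    return 0;
  }
}
```

A tool's main method could pass the returned value to System.exit(), so that a shell script driving the crawl workflow sees a non-zero status when a job has failed or been killed.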
Issue Links
- is caused by
  - NUTCH-2375 Upgrade the code base from org.apache.hadoop.mapred to org.apache.hadoop.mapreduce (Closed)
- supersedes
  - NUTCH-2161 Interrupted failed and/or killed tasks fail to clean up temp directories in HDFS (Closed)
  - NUTCH-1783 Cleanup temp folders in case of failures (Closed)