Details
-
Bug
-
Status: Closed
-
Critical
-
Resolution: Fixed
-
0.23.0
-
None
-
Hadoop version is: Hadoop 0.23.0.1110031628
10 node test cluster
-
Reviewed
-
Corrected MR AM to honor speculative configuration and enable speculating either maps or reduces.
Description
When forcing a mapper to take significantly longer than other map tasks, speculative map tasks are
launched even if the mapreduce.job.maps.speculative.execution parameter is set to 'false'.
Test case: ran the default WordCount job with speculative execution set to false for both map and reduce, but still saw
a fifth mapper task launch. The job was run as follows:
hadoop --config <config> jar /tmp/testphw/wordcount.jar WordCount
-Dmapreduce.job.maps.speculative.execution=false -Dmapreduce.job.reduces.speculative.execution=false
/tmp/test_file_of_words* /tmp/file_of_words.out
Input data was four text files, each larger than the HDFS block size, with the same word pattern plus one different
text line in each file; the fourth file was about four times as large as the others:
hadoop --config <config> fs -ls /tmp
Found 5 items
drwxr-xr-x - user hdfs 0 2011-10-20 16:17 /tmp/file_of_words.out
-rw-r--r-- 3 user hdfs 62800021 2011-10-20 14:45 /tmp/test_file_of_words1
-rw-r--r-- 3 user hdfs 62800024 2011-10-20 14:46 /tmp/test_file_of_words2
-rw-r--r-- 3 user hdfs 62800024 2011-10-20 14:46 /tmp/test_file_of_words3
-rw-r--r-- 3 user hdfs 271708312 2011-10-20 15:50 /tmp/test_file_of_words4
The job launched 5 mappers despite speculative execution being set to false. Output snippet:
org.apache.hadoop.mapreduce.JobCounter
NUM_FAILED_MAPS=1
TOTAL_LAUNCHED_MAPS=5
TOTAL_LAUNCHED_REDUCES=1
RACK_LOCAL_MAPS=5
SLOTS_MILLIS_MAPS=273540
SLOTS_MILLIS_REDUCES=212876
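A rough way to check a run for unexpected extra map attempts is to pull TOTAL_LAUNCHED_MAPS out of the counter output and compare it with the number of map tasks expected. The sketch below is only illustrative: the counter_value helper is hypothetical, the counter text is pasted from the snippet above rather than read from a live job, and the expected count of 4 is an assumption taken from this report (a fifth mapper was the surprise).

```shell
#!/bin/sh
# Hypothetical helper: reads counter lines of the form NAME=VALUE on stdin
# and prints the value of the counter named in $1.
counter_value() {
    awk -F= -v name="$1" '
        { gsub(/^[ \t]+|[ \t]+$/, "", $1) }   # trim whitespace around the name
        $1 == name { print $2; exit }
    '
}

# Counter output pasted from the failing run in this report.
counters='org.apache.hadoop.mapreduce.JobCounter
NUM_FAILED_MAPS=1
TOTAL_LAUNCHED_MAPS=5
TOTAL_LAUNCHED_REDUCES=1
RACK_LOCAL_MAPS=5
SLOTS_MILLIS_MAPS=273540
SLOTS_MILLIS_REDUCES=212876'

# Assumption: four map tasks are expected, as implied by the report.
expected_maps=4

launched=$(printf '%s\n' "$counters" | counter_value TOTAL_LAUNCHED_MAPS)
extra=$((launched - expected_maps))

if [ "$extra" -gt 0 ]; then
    echo "extra map attempts: $extra"
fi
```

With the counters above this prints "extra map attempts: 1". It only flags that more attempts were launched than expected; it cannot tell a speculative attempt from a retry after a genuine failure.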
Reran the same case as above with both speculative execution parameters set to 'true'. The results were the same, only
this time the fifth task launch is expected, since speculative execution is enabled.
Job run:
hadoop --config <config> jar /tmp/testphw/wordcount.jar WordCount
-Dmapreduce.job.maps.speculative.execution=true -Dmapreduce.job.reduces.speculative.execution=true
/tmp/test_file_of_words* /tmp/file_of_words.out
Output snippet:
org.apache.hadoop.mapreduce.JobCounter
NUM_FAILED_MAPS=1
TOTAL_LAUNCHED_MAPS=5
TOTAL_LAUNCHED_REDUCES=1
RACK_LOCAL_MAPS=5
SLOTS_MILLIS_MAPS=279653
SLOTS_MILLIS_REDUCES=211474
I have two corrections to make and three observations.
mapreduce.job.maps.speculative.execution.
mapreduce.job.reduces.speculative.execution.
works as expected on the command line:
off speculative map execution.
possible to turn off speculative map execution on the command line. For some reason, setting
mapreduce.reduce.speculative to true in the config file overrides the command line value of mapreduce.map.speculative