[HADOOP-4041] IsolationRunner does not work as documented - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.18.0
Fix Version/s: 0.21.0
Component/s: documentation
Labels: None
Hadoop Flags: Reviewed
Release Note: Fixed a bug in IsolationRunner to make it work for map tasks.

Description

IsolationRunner does not work as documented in the tutorial.

The tutorial says "To use the IsolationRunner, first set keep.failed.tasks.files to true (also see keep.tasks.files.pattern)."

The property name should be keep.failed.task.files ("task", not "tasks").

After setting the correctly named property, I ran into the following (quoted from my message on hadoop-core):
> After the task hung, I failed it via the web interface. Then I went to the
> node that was running this task:
>
> $ cd ...local/taskTracker/jobcache/job_200808071645_0001/work
> (this path is already different from the tutorial's)
>
> $ hadoop org.apache.hadoop.mapred.IsolationRunner ../job.xml
> Exception in thread "main" java.lang.NullPointerException
>     at org.apache.hadoop.mapred.IsolationRunner.main(IsolationRunner.java:164)
>
> Looking at IsolationRunner code, I see this:
>
> 164  File workDirName = new File(lDirAlloc.getLocalPathToRead(
> 165      TaskTracker.getJobCacheSubdir()
> 166      + Path.SEPARATOR + taskId.getJobID()
> 167      + Path.SEPARATOR + taskId
> 168      + Path.SEPARATOR + "work",
> 169      conf).toString());
>
> That is, the code assumes a taskID subdirectory exists under the job
> dir, but:
> $ pwd
> ...mapred/local/taskTracker/jobcache/job_200808071645_0001
> $ ls
> jars job.xml work
>
> – it's not there.
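
The mismatch described above can be checked by hand before running IsolationRunner. The following is a minimal sketch, not Hadoop code: it uses only java.io.File to build the <jobcache>/<jobID>/<taskID>/work layout that the quoted lines appear to expect, and reports whether that directory exists. The class name, method names, default jobcache path, and the TASK_ID placeholder are all hypothetical; substitute the real local directory and task ID from your node.

```java
import java.io.File;

public class TaskWorkDirCheck {

    // Builds <jobCacheDir>/<jobId>/<taskId>/work, mirroring the path
    // concatenation in the quoted IsolationRunner lines 164-169.
    static File expectedWorkDir(File jobCacheDir, String jobId, String taskId) {
        File jobDir = new File(jobCacheDir, jobId);
        File taskDir = new File(jobDir, taskId);
        return new File(taskDir, "work");
    }

    // Reports whether the directory is present, so a missing directory is
    // visible up front rather than surfacing as an exception later.
    static String describe(File workDir) {
        return (workDir.isDirectory() ? "found: " : "missing: ") + workDir.getPath();
    }

    public static void main(String[] args) {
        // Hypothetical defaults; pass the real values as arguments.
        File jobCache = new File(args.length > 0 ? args[0] : "taskTracker/jobcache");
        String jobId = args.length > 1 ? args[1] : "job_200808071645_0001";
        String taskId = args.length > 2 ? args[2] : "TASK_ID"; // placeholder, not a real task ID

        System.out.println(describe(expectedWorkDir(jobCache, jobId, taskId)));
    }
}
```

On a node laid out as in the report (only jars, job.xml, and work under the job directory), this check would print a "missing:" line for the per-task work directory.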

Attachments

- hadoop-4041.patch (10 kB), 02/Dec/08 17:36, Thomas White
- HADOOP-4041-v2.patch (39 kB), 25/May/09 22:50, Philip Martin
- HADOOP-4041-v3.patch (38 kB), 27/May/09 16:32, Philip Martin
- HADOOP-4041-v4.patch (39 kB), 29/May/09 23:15, Philip Martin
- HADOOP-4041-v4-y20.patch (39 kB), 15/Jan/10 13:38, Hemanth Yamijala
- org.apache.hadoop.fs.LocalDirAllocator.html (6 kB), 01/Jun/09 04:18, Philip Martin

Issue Links

is related to: MAPREDUCE-234 Isolation runner needs a testcase (Resolved)
relates to: MAPREDUCE-1325 Fix IsolationRunner to run with reduces too (Resolved)

People

Assignee: Philip Martin
Reporter: Yuri Pradkin
Votes: 0
Watchers: 6

Dates

Created: 28/Aug/08 16:17
Updated: 24/Aug/10 20:34
Resolved: 16/Jun/09 01:18