[MAPREDUCE-5655] Remote job submit from windows to a linux hadoop cluster fails due to wrong classpath - ASF JIRA


Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: 2.2.0, 2.3.0
Fix Version/s: None
Component/s: client, job submission
Labels: None
Environment:

Client machine is a Windows 7 box, with Eclipse
Remote: a multi-node Hadoop cluster installed on Ubuntu boxes (any Linux)

Description

I was trying to run a Java class on my client (a Windows 7 development environment) that submits a job to the remote Hadoop cluster, starts a MapReduce job there, and then downloads the results back to the local machine.

The general use case is using Hadoop services from a web application installed on a non-cluster machine, or as part of a development environment.

The problem was that the ApplicationMaster's startup shell script (launch_container.sh) was generated with a wrong CLASSPATH entry. Both that entry and the java process invocation at the bottom of the file were generated in Windows style, using % as the shell variable marker and ; as the CLASSPATH delimiter.

I tracked down the root cause and found that MRApps.java and YARNRunner.java create these entries and pass them on to the ApplicationMaster, assuming that the OS running these classes matches the one running the ApplicationMaster. That is not the case: the two run in different JVMs, possibly on different operating systems, and the strings are generated based on the OS of the client (submitter) side.
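To illustrate the mismatch described above, here is a minimal, self-contained sketch. It is not the Hadoop code: `ClasspathSketch`, `TargetOs`, `expandVar`, and `buildClasspath` are hypothetical names made up for this example. It only shows how the same classpath entries render differently depending on which OS convention is used to format them.

```java
import java.util.Arrays;
import java.util.List;

public class ClasspathSketch {
    /** Hypothetical marker for the OS whose shell will run the launch script. */
    enum TargetOs { LINUX, WINDOWS }

    /** Renders an environment variable reference in the target shell's syntax. */
    static String expandVar(String name, TargetOs os) {
        return os == TargetOs.WINDOWS ? "%" + name + "%" : "$" + name;
    }

    /** Returns the classpath delimiter used on the target OS. */
    static String separator(TargetOs os) {
        return os == TargetOs.WINDOWS ? ";" : ":";
    }

    /** Joins classpath entries with the delimiter of the target OS. */
    static String buildClasspath(List<String> entries, TargetOs os) {
        return String.join(separator(os), entries);
    }

    public static void main(String[] args) {
        // Format the same two entries once per convention to compare the results.
        for (TargetOs os : TargetOs.values()) {
            List<String> entries = Arrays.asList(
                    expandVar("PWD", os),
                    expandVar("HADOOP_CONF_DIR", os));
            System.out.println(os + ": " + buildClasspath(entries, os));
        }
    }
}
```

If the formatting decision is taken from the submitting machine, a Windows client produces the `%VAR%;%VAR%` form, which a bash-based launch_container.sh on Linux cannot interpret as intended.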

I made some workaround changes to these two files so that I could launch my job; however, there may be more problems ahead.

update
error message:
13/12/04 16:33:15 INFO mapreduce.Job: Job job_1386170530016_0001 failed with state FAILED due to: Application application_1386170530016_0001 failed 2 times due to AM Container for appattempt_1386170530016_0001_000002 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control

at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)

update2:
It is also required to add the following property to mapred-site.xml (or mapred-default.xml) on the Windows box, so that the job launcher knows that the job will run on Linux:
<property>
<name>mapred.remote.os</name>
<value>Linux</value>
<description>Remote MapReduce framework's OS, can be either Linux or Windows</description>
</property>

Without this entry the patched jar behaves the same as the unpatched one, so the property is required for the workaround to work.
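The fallback behaviour described here could look roughly like the sketch below. This is an assumption about how such a lookup might be written, not the content of the attached patches: `RemoteOsSketch` and `targetIsWindows` are hypothetical names, and `java.util.Properties` stands in for the Hadoop configuration object so that the example stays self-contained.

```java
import java.util.Properties;

public class RemoteOsSketch {
    /**
     * Decides whether launch commands should use Windows syntax.
     * Uses mapred.remote.os when present; otherwise falls back to the
     * local OS, which mirrors the unpatched behaviour.
     */
    static boolean targetIsWindows(Properties conf) {
        String remote = conf.getProperty("mapred.remote.os");
        if (remote == null) {
            // No override configured: assume the cluster matches the client OS.
            return System.getProperty("os.name").startsWith("Windows");
        }
        return remote.equalsIgnoreCase("Windows");
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("mapred.remote.os", "Linux");
        System.out.println("Use Windows syntax: " + targetIsWindows(conf));
    }
}
```

With the property set to Linux, the decision no longer depends on the OS of the submitting machine.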

Attachments


MRApps.patch
04/Dec/13 13:44
5 kB
Attila Pados
YARNRunner.patch
04/Dec/13 13:44
0.7 kB
Attila Pados

Issue Links

duplicates

MAPREDUCE-4052 Windows eclipse cannot submit job from Windows client to Linux/Unix Hadoop cluster.

Closed


People

Assignee: JoneZhang

Reporter: Attila Pados

Votes: 0

Watchers: 7

Dates

Created: 26/Nov/13 13:16

Updated: 10/Apr/14 04:16

Resolved: 02/Dec/13 18:35