  Hadoop Map/Reduce
  MAPREDUCE-4357

Snappy Codec does not load properly when m/r job is run in "uber" mode


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

        sudo -u hdfs hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-cdh4.0.0-tests.jar TestDFSIO -write
        12/06/01 18:17:11 INFO fs.TestDFSIO: TestDFSIO.0.0.6
        12/06/01 18:17:11 INFO fs.TestDFSIO: nrFiles = 1
        12/06/01 18:17:11 INFO fs.TestDFSIO: fileSize (MB) = 1.0
        12/06/01 18:17:11 INFO fs.TestDFSIO: bufferSize = 1000000
        12/06/01 18:17:11 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
        12/06/01 18:17:11 INFO fs.TestDFSIO: creating control file: 1048576 bytes, 1 files
        12/06/01 18:17:12 INFO fs.TestDFSIO: created control files for: 1 files
        12/06/01 18:17:12 INFO mapred.FileInputFormat: Total input paths to process : 1
        12/06/01 18:17:12 INFO mapreduce.JobSubmitter: number of splits:1
        12/06/01 18:17:12 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
        12/06/01 18:17:12 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
        12/06/01 18:17:12 WARN conf.Configuration: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
        12/06/01 18:17:12 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
        12/06/01 18:17:12 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
        12/06/01 18:17:12 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
        12/06/01 18:17:12 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
        12/06/01 18:17:12 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
        12/06/01 18:17:12 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
        12/06/01 18:17:12 INFO mapred.ResourceMgrDelegate: Submitted application application_1338599410922_0004 to ResourceManager at /0.0.0.0:8032
        12/06/01 18:17:12 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1338599410922_0004/
        12/06/01 18:17:12 INFO mapreduce.Job: Running job: job_1338599410922_0004
        12/06/01 18:17:17 INFO mapreduce.Job: Job job_1338599410922_0004 running in uber mode : true
        12/06/01 18:17:17 INFO mapreduce.Job: map 0% reduce 0%
        12/06/01 18:17:17 INFO mapreduce.Job: Job job_1338599410922_0004 failed with state FAILED due to:
        12/06/01 18:17:17 INFO mapreduce.Job: Counters: 11
        Job Counters
        Failed map tasks=1
        Failed reduce tasks=1
        Launched map tasks=1
        Launched reduce tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=2456
        Total time spent by all reduces in occupied slots (ms)=136
        TOTAL_LAUNCHED_UBERTASKS=2
        NUM_UBER_SUBMAPS=1
        NUM_UBER_SUBREDUCES=1
        NUM_FAILED_UBERTASKS=2
        java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:883)
        at org.apache.hadoop.fs.TestDFSIO.runIOTest(TestDFSIO.java:340)
        at org.apache.hadoop.fs.TestDFSIO.writeTest(TestDFSIO.java:321)
        at org.apache.hadoop.fs.TestDFSIO.run(TestDFSIO.java:520)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.fs.TestDFSIO.main(TestDFSIO.java:445)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
        at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:112)
        at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:120)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

      Attaching my mapred-site.xml as well.
      I have Snappy compression enabled, but I never see the codec get loaded in the logs.

      cat /etc/hadoop/conf.pseudo/mapred-site.xml
      <?xml version="1.0"?>
      <!--
      Licensed to the Apache Software Foundation (ASF) under one or more
      contributor license agreements. See the NOTICE file distributed with
      this work for additional information regarding copyright ownership.
      The ASF licenses this file to You under the Apache License, Version 2.0
      (the "License"); you may not use this file except in compliance with
      the License. You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License.
      -->
      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

      <configuration>
      <property>
      <name>mapred.job.tracker</name>
      <value>localhost:8021</value>
      </property>

      <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
      </property>

      <property>
      <description>To set the value of tmp directory for map and reduce tasks.</description>
      <name>mapreduce.task.tmp.dir</name>
      <value>/var/lib/hadoop-mapreduce/cache/${user.name}/tasks</value>
      </property>

      <property>
      <name>mapreduce.job.ubertask.enable</name>
      <value>true</value>
      <description>Run very small jobs in a single JVM</description>
      </property>

      <property>
      <name>mapreduce.map.output.compress</name>
      <value>true</value>
      <description>Should the outputs of the maps be compressed before being sent
      across the network.
      </description>
      </property>

      <!-- This is set since the disks were not really being stressed much, but
      CPUs are. If this is generating too much disk I/O, it can be set back to
      DefaultCodec (deflate) -->
      <property>
      <name>mapreduce.map.output.compress.codec</name>
      <value>org.apache.hadoop.io.compress.SnappyCodec</value>
      </property>
      </configuration>
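
      For anyone trying to narrow this down: below is a minimal sketch (the class
      name SnappyLoadCheck is made up here, and it assumes the Hadoop 2.x client
      jars and native libraries are visible to the JVM under test) that loads
      SnappyCodec the same way the framework does, via reflection. If the native
      Snappy library is not available to that JVM, createCompressor() should fail,
      which would be consistent with the codec never showing up as loaded in the logs.

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.io.compress.CompressionCodec;
        import org.apache.hadoop.util.NativeCodeLoader;
        import org.apache.hadoop.util.ReflectionUtils;

        public class SnappyLoadCheck {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // libhadoop has to load before any native compression codec can.
            System.out.println("libhadoop loaded: " + NativeCodeLoader.isNativeCodeLoaded());
            // Instantiate the codec via reflection, as the framework does.
            Class<?> codecClass =
                conf.getClassByName("org.apache.hadoop.io.compress.SnappyCodec");
            CompressionCodec codec =
                (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
            // Expected to throw if native Snappy support is unavailable in this JVM.
            codec.createCompressor();
            System.out.println("SnappyCodec instantiated and compressor created OK");
          }
        }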

      In non-uber mode it works fine:

      sudo -u hdfs hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-cdh4.0.0-tests.jar TestDFSIO -Dmapreduce.job.ubertask.enable=false -write
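
      The same workaround can presumably be applied per job from the Java
      submission API rather than the command line; a rough sketch (the job name
      and class name are placeholders, and the mapper/reducer/paths are elided):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.mapreduce.Job;

        public class SubmitWithoutUber {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Same properties that mapred-site.xml sets above, scoped to this one job.
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.set("mapreduce.map.output.compress.codec",
                     "org.apache.hadoop.io.compress.SnappyCodec");
            // Equivalent of -Dmapreduce.job.ubertask.enable=false on the command line.
            conf.setBoolean("mapreduce.job.ubertask.enable", false);
            Job job = Job.getInstance(conf, "snappy-without-uber");
            // ... set jar, mapper, reducer, and input/output paths as usual ...
            System.exit(job.waitForCompletion(true) ? 0 : 1);
          }
        }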

      Attachments

        mapred-site.xml

          People

            Assignee: Unassigned
            Reporter: Jeff Lord (jlord)
            Votes: 1
            Watchers: 11
