  Hadoop Map/Reduce
  MAPREDUCE-4357

Snappy Codec does not load properly when m/r job is run in "uber" mode


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

        sudo -u hdfs hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-cdh4.0.0-tests.jar TestDFSIO -write
        12/06/01 18:17:11 INFO fs.TestDFSIO: TestDFSIO.0.0.6
        12/06/01 18:17:11 INFO fs.TestDFSIO: nrFiles = 1
        12/06/01 18:17:11 INFO fs.TestDFSIO: fileSize (MB) = 1.0
        12/06/01 18:17:11 INFO fs.TestDFSIO: bufferSize = 1000000
        12/06/01 18:17:11 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
        12/06/01 18:17:11 INFO fs.TestDFSIO: creating control file: 1048576 bytes, 1 files
        12/06/01 18:17:12 INFO fs.TestDFSIO: created control files for: 1 files
        12/06/01 18:17:12 INFO mapred.FileInputFormat: Total input paths to process : 1
        12/06/01 18:17:12 INFO mapreduce.JobSubmitter: number of splits:1
        12/06/01 18:17:12 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
        12/06/01 18:17:12 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
        12/06/01 18:17:12 WARN conf.Configuration: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
        12/06/01 18:17:12 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
        12/06/01 18:17:12 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
        12/06/01 18:17:12 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
        12/06/01 18:17:12 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
        12/06/01 18:17:12 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
        12/06/01 18:17:12 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
        12/06/01 18:17:12 INFO mapred.ResourceMgrDelegate: Submitted application application_1338599410922_0004 to ResourceManager at /0.0.0.0:8032
        12/06/01 18:17:12 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1338599410922_0004/
        12/06/01 18:17:12 INFO mapreduce.Job: Running job: job_1338599410922_0004
        12/06/01 18:17:17 INFO mapreduce.Job: Job job_1338599410922_0004 running in uber mode : true
        12/06/01 18:17:17 INFO mapreduce.Job: map 0% reduce 0%
        12/06/01 18:17:17 INFO mapreduce.Job: Job job_1338599410922_0004 failed with state FAILED due to:
        12/06/01 18:17:17 INFO mapreduce.Job: Counters: 11
        Job Counters
        Failed map tasks=1
        Failed reduce tasks=1
        Launched map tasks=1
        Launched reduce tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=2456
        Total time spent by all reduces in occupied slots (ms)=136
        TOTAL_LAUNCHED_UBERTASKS=2
        NUM_UBER_SUBMAPS=1
        NUM_UBER_SUBREDUCES=1
        NUM_FAILED_UBERTASKS=2
        java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:883)
        at org.apache.hadoop.fs.TestDFSIO.runIOTest(TestDFSIO.java:340)
        at org.apache.hadoop.fs.TestDFSIO.writeTest(TestDFSIO.java:321)
        at org.apache.hadoop.fs.TestDFSIO.run(TestDFSIO.java:520)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.fs.TestDFSIO.main(TestDFSIO.java:445)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
        at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:112)
        at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:120)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

      Attaching my mapred-site.xml as well.
      I have Snappy compression enabled, but I never see the codec get loaded in the logs.

      cat /etc/hadoop/conf.pseudo/mapred-site.xml
      <?xml version="1.0"?>
      <!--
      Licensed to the Apache Software Foundation (ASF) under one or more
      contributor license agreements. See the NOTICE file distributed with
      this work for additional information regarding copyright ownership.
      The ASF licenses this file to You under the Apache License, Version 2.0
      (the "License"); you may not use this file except in compliance with
      the License. You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License.
      -->
      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

      <configuration>
      <property>
      <name>mapred.job.tracker</name>
      <value>localhost:8021</value>
      </property>

      <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
      </property>

      <property>
      <description>To set the value of tmp directory for map and reduce tasks.</description>
      <name>mapreduce.task.tmp.dir</name>
      <value>/var/lib/hadoop-mapreduce/cache/${user.name}/tasks</value>
      </property>

      <property>
      <name>mapreduce.job.ubertask.enable</name>
      <value>true</value>
      <description>Run very small jobs in a single JVM</description>
      </property>

      <property>
      <name>mapreduce.map.output.compress</name>
      <value>true</value>
      <description>Should the outputs of the maps be compressed before being sent
      across the network.
      </description>
      </property>

      <!-- This is set since the disks were not really being stressed much, but
      CPUs are. If this is generating too much disk I/O, it can be set back to
      DefaultCodec (deflate) -->
      <property>
      <name>mapreduce.map.output.compress.codec</name>
      <value>org.apache.hadoop.io.compress.SnappyCodec</value>
      </property>
      </configuration>
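
      For anyone trying to narrow this down: below is a minimal sketch (the class
      name SnappyLoadCheck is made up here, and it assumes the Hadoop 2.x client
      jars and native libraries are visible to the JVM under test) that loads
      SnappyCodec the same way the framework does, via reflection. If the native
      Snappy library is not available to that JVM, createCompressor() should fail,
      which would be consistent with the codec never showing up as loaded in the logs.

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.io.compress.CompressionCodec;
        import org.apache.hadoop.util.NativeCodeLoader;
        import org.apache.hadoop.util.ReflectionUtils;

        public class SnappyLoadCheck {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // libhadoop has to load before any native compression codec can.
            System.out.println("libhadoop loaded: " + NativeCodeLoader.isNativeCodeLoaded());
            // Instantiate the codec via reflection, as the framework does.
            Class<?> codecClass =
                conf.getClassByName("org.apache.hadoop.io.compress.SnappyCodec");
            CompressionCodec codec =
                (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
            // Expected to throw if native Snappy support is unavailable in this JVM.
            codec.createCompressor();
            System.out.println("SnappyCodec instantiated and compressor created OK");
          }
        }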

      In non-uber mode it works fine:

      sudo -u hdfs hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-cdh4.0.0-tests.jar TestDFSIO -Dmapreduce.job.ubertask.enable=false -write
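
      The same workaround can presumably be applied per job from the Java
      submission API rather than the command line; a rough sketch (the job name
      and class name are placeholders, and the mapper/reducer/paths are elided):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.mapreduce.Job;

        public class SubmitWithoutUber {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Same properties that mapred-site.xml sets above, scoped to this one job.
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.set("mapreduce.map.output.compress.codec",
                     "org.apache.hadoop.io.compress.SnappyCodec");
            // Equivalent of -Dmapreduce.job.ubertask.enable=false on the command line.
            conf.setBoolean("mapreduce.job.ubertask.enable", false);
            Job job = Job.getInstance(conf, "snappy-without-uber");
            // ... set jar, mapper, reducer, and input/output paths as usual ...
            System.exit(job.waitForCompletion(true) ? 0 : 1);
          }
        }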

      Attachments

        mapred-site.xml

          People

            Assignee: Unassigned
            Reporter: Jeff Lord (jlord)
            Votes: 1
            Watchers: 11
