Hadoop Common / HADOOP-9328

INSERT INTO an S3 external table with no reduce phase results in FileNotFoundException


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Cannot Reproduce
    • Affects Version/s: 2.0.2-alpha
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels: None
    • Environment: YARN, Hadoop 2.0.2-alpha, Ubuntu

    Description

      Running YARN with Hadoop 2.0.2-alpha and Hive 0.9.0.

      The destination is the S3-backed external table below; the source for the query is a small Hive-managed table.

      CREATE EXTERNAL TABLE payout_state_product (
      state STRING,
      product_id STRING,
      element_id INT,
      element_value DOUBLE,
      number_of_fields INT)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      STORED AS TEXTFILE
      LOCATION 's3://com.weatherbill.foo/bar/payout_state_product/';
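
      For completeness, here is a minimal sketch of the Hive-managed source table used below. Its DDL is not part of the original report; the schema is assumed to mirror the destination, since the query is a plain SELECT *.

      -- Hypothetical source-table DDL (not in the original report); the
      -- columns are assumed to match payout_state_product exactly.
      CREATE TABLE payout_state_product_cached (
      state STRING,
      product_id STRING,
      element_id INT,
      element_value DOUBLE,
      number_of_fields INT)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      STORED AS TEXTFILE;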

      A simple query copies the results from the Hive-managed table into the S3 table:

      hive> INSERT OVERWRITE TABLE payout_state_product
      SELECT * FROM payout_state_product_cached;

      Total MapReduce jobs = 2
      Launching Job 1 out of 2
      Number of reduce tasks is set to 0 since there's no reduce operator
      Starting Job = job_1360884012490_0014, Tracking URL = http://i-9ff9e9ef.us-east-1.production.climatedna.net:8088/proxy/application_1360884012490_0014/
      Kill Command = /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=i-9ff9e9ef.us-east-1.production.climatedna.net:8032 -kill job_1360884012490_0014
      Hadoop job information for Stage-1: number of mappers: 100; number of reducers: 0
      2013-02-22 19:15:46,709 Stage-1 map = 0%, reduce = 0%
      ...snip...
      2013-02-22 19:17:02,374 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 427.13 sec
      MapReduce Total cumulative CPU time: 7 minutes 7 seconds 130 msec
      Ended Job = job_1360884012490_0014
      Ended Job = -1776780875, job is filtered out (removed at runtime).
      Launching Job 2 out of 2
      Number of reduce tasks is set to 0 since there's no reduce operator
      java.io.FileNotFoundException: File does not exist: /tmp/hive-marc/hive_2013-02-22_19-15-31_691_7365912335285010827/-ext-10002/000000_0
      at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:782)
      at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat$OneFileInfo.<init>(CombineFileInputFormat.java:493)
      at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:284)
      at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:244)
      at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:69)
      at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:386)
      at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:352)
      at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.processPaths(CombineHiveInputFormat.java:419)
      at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:390)
      at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:479)
      at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:471)
      at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
      at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
      at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:396)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
      at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
      at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617)
      at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:396)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
      at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612)
      at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:435)
      at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
      at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
      at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
      at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1326)
      at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1118)
      at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951)
      at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
      at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
      at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
      at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
      at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
      Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /tmp/hive-marc/hive_2013-02-22_19-15-31_691_7365912335285010827/-ext-10002/000000_0)'
      FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
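
      The failing second stage looks like Hive's conditional small-file merge job: the first job is map-only (100 mappers, 0 reducers), and "job is filtered out (removed at runtime)" indicates a conditional task resolved at runtime. A possible workaround sketch, assuming that diagnosis is correct, is to disable merging of map-only output so the INSERT stays a single job:

      -- Workaround sketch (assumption: the failing Stage-2 is the conditional
      -- merge of small map outputs; hive.merge.mapfiles defaults to true).
      SET hive.merge.mapfiles=false;
      INSERT OVERWRITE TABLE payout_state_product
      SELECT * FROM payout_state_product_cached;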

          People

            Assignee: Unassigned
            Reporter: Marc Limotte (mlimotte)
            Votes: 0
            Watchers: 4
