  1. Pig
  2. PIG-3267

HCatStorer fails in limit query

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.2, 0.10.1, 0.11.1
    • Fix Version/s: 0.12.0, 0.11.2
    • Component/s: None
    • Labels: None
    • Patch Info: Patch Available
    • Hadoop Flags: Reviewed

    Description

      The following query fails:

      data = LOAD 'student.txt' as (name:chararray, age:int, gpa:double);
      data_limited = limit data 10;
      samples = foreach data_limited generate age as number;
      store samples into 'samples' using org.apache.hcatalog.pig.HCatStorer('part_dt=20130101T010000T36');
      

      The error occurs before the second job is launched. Error message:

      Message: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:8020/user/hive/warehouse/samples/part_dt=20130101T010000T36 already exists
      	at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:121)
      	at org.apache.hcatalog.mapreduce.FileOutputFormatContainer.checkOutputSpecs(FileOutputFormatContainer.java:135)
      	at org.apache.hcatalog.mapreduce.HCatBaseOutputFormat.checkOutputSpecs(HCatBaseOutputFormat.java:72)
      	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecsHelper(PigOutputFormat.java:207)
      	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:188)
      	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:887)
      	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:396)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
      	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
      	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
      	at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
      	at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.pig.backend.hadoop20.PigJobControl.mainLoopAction(PigJobControl.java:157)
      	at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:134)
      	at java.lang.Thread.run(Thread.java:680)
      	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)
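
      The trace shows the job being rejected in FileOutputFormat.checkOutputSpecs because the output directory already exists. The following is a minimal sketch of that kind of check, not Hadoop or HCatalog code: it uses java.nio on the local filesystem as a stand-in for HDFS, and the names OutputSpecCheckSketch and OutputDirectoryExistsException are hypothetical. It assumes the partition directory is created between two checks of the same path; the report does not say which component creates it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for an output-spec check; not Hadoop or HCatalog code.
final class OutputSpecCheckSketch {

    // Hypothetical exception type playing the role of FileAlreadyExistsException.
    static final class OutputDirectoryExistsException extends IOException {
        OutputDirectoryExistsException(String message) {
            super(message);
        }
    }

    // Rejects the job if its output directory is already present,
    // which is the condition reported in the error message above.
    static void checkOutputSpecs(Path outputDir) throws OutputDirectoryExistsException {
        if (Files.exists(outputDir)) {
            throw new OutputDirectoryExistsException(
                "Output directory " + outputDir + " already exists");
        }
    }

    public static void main(String[] args) throws IOException {
        // Local temporary directory standing in for the Hive warehouse on HDFS.
        Path warehouse = Files.createTempDirectory("warehouse");
        Path partition = warehouse.resolve("samples").resolve("part_dt=20130101T010000T36");

        // First check: the partition directory does not exist yet, so it passes.
        checkOutputSpecs(partition);

        // Assumption: something creates the directory before the next job is submitted.
        Files.createDirectories(partition);

        // Second check against the same path is now rejected.
        try {
            checkOutputSpecs(partition);
        } catch (OutputDirectoryExistsException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

      The sketch only illustrates why a repeated check against the same output path can fail; it says nothing about how the attached patch resolves the issue.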
      

      Attachments

        1. PIG-3267-1.patch
          2 kB
          Daniel Dai

            People

              Assignee: Daniel Dai (daijy)
              Reporter: Daniel Dai (daijy)
              Votes: 0
              Watchers: 4
