HCatalog / HCATALOG-380

If a Pig script does a load followed by an order by, hive-site.xml doesn't seem to propagate properly

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4, 0.5, 0.4.1
    • Fix Version/s: 0.4.1
    • Component/s: pig
    • Labels:
      None

      Description

      The table is not partitioned and stores its data as an RCFile.

      The following Pig script causes the MR jobs to fail:
      a = load 'default.table' USING org.apache.hcatalog.pig.HCatLoader();
      b = order a by id;
      dump b;

      If I prefix the order by with a foreach statement, then the job will pass.

      The MR job fails with the following exception:

      Error: java.lang.ClassNotFoundException: javax.jdo.JDOException
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:346)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:333)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:371)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:278)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:248)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:114)
        at org.apache.hcatalog.mapreduce.InitializeInput.createHiveMetaClient(InitializeInput.java:58)
        at org.apache.hcatalog.mapreduce.InitializeInput.getSerializedHcatKeyJobInfo(InitializeInput.java:85)
        at org.apache.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:73)
        at org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:40)
        at org.apache.hcatalog.pig.HCatLoader.setLocation(HCatLoader.java:116)
        at org.apache.pig.impl.builtin.SampleLoader.setLocation(SampleLoader.java:98)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.mergeSplitSpecificConf(PigInputFormat.java:134)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:112)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:489)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
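
      The trace shows the map task constructing HiveMetaStore$HMSHandler in-process and then failing to load javax.jdo.JDOException. One quick way to confirm that the class is simply absent from a given JVM's classpath is a small probe like the one below. This is only a diagnostic sketch: the class name ClasspathProbe and its isLoadable helper are made up for illustration and are not part of HCatalog, Hive, or Pig.

```java
/** Hypothetical diagnostic helper; not part of HCatalog, Hive, or Pig. */
final class ClasspathProbe {

    /** Returns true if the named class can be located by this class's loader. */
    static boolean isLoadable(String className) {
        try {
            // 'false' skips static initialization; we only care whether the class can be found.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace above.
        String name = args.length > 0 ? args[0] : "javax.jdo.JDOException";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

      Running it with the same classpath as the failing task would indicate whether the JDO jar is missing there; it does not by itself explain why the task tried to build a local metastore handler.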


          Activity

          Francis Liu made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 0.4.1 [ 12320255 ]
          Resolution Fixed [ 1 ]
          Francis Liu added a comment -

          Checked into trunk and branch-0.4.

          Daniel Dai added a comment -

          +1

          Francis Liu made changes -
          Attachment HCATALOG-380.patch [ 12526230 ]
          Francis Liu added a comment -

          We need to add a patch to support Pig's behavior of using the same signature for SampleLoader and the actual loadFunc. HCatLoader exposes this problem because it caches information the first time setLocation() is called and merely repopulates from the cache on later calls. As a result, the non-sample MR job reuses the same credentials as the sample MR job, which causes the second job to fail because the credentials have expired.
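
          The caching behavior described in this comment can be illustrated with a toy model. The sketch below is not HCatLoader's code: CachingLoaderSketch, its static map, and the token strings are all hypothetical stand-ins, assuming only what the comment states (state is cached on the first setLocation() call and keyed by a signature that the sample loader and the real loader share).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a loader that caches per-signature state; not HCatLoader's real code. */
final class CachingLoaderSketch {

    // Simulates a property store shared by every loader instance with the same signature.
    private static final Map<String, String> CACHE = new HashMap<>();

    private final String signature;

    CachingLoaderSketch(String signature) {
        this.signature = signature;
    }

    /** The first call for a signature stores the credential; later calls return the cached one. */
    String setLocation(String freshCredential) {
        return CACHE.computeIfAbsent(signature, key -> freshCredential);
    }

    /** Clears the simulated store so the sketch can be rerun from a clean state. */
    static void reset() {
        CACHE.clear();
    }

    public static void main(String[] args) {
        // The sample job and the real job share one signature, as the comment describes.
        CachingLoaderSketch sampleJobLoader = new CachingLoaderSketch("sig-1");
        CachingLoaderSketch orderJobLoader = new CachingLoaderSketch("sig-1");

        System.out.println(sampleJobLoader.setLocation("token-for-sample-job"));
        // The second job asks with a fresh token but gets the first job's cached token back.
        System.out.println(orderJobLoader.setLocation("token-for-order-job"));
    }
}
```

          In this model both calls print token-for-sample-job, which mirrors the reported symptom of the second job running with the first job's credentials. Giving the two loaders distinct signatures would keep their cached state apart, which appears to be the direction discussed under PIG-2666.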

          Francis Liu added a comment -

          The problem should be addressed by PIG-2666. Waiting for resolution on that before determining whether anything else needs to be done.

          Francis Liu added a comment -

          Agreed, it will be cleaner. Would we be able to check in the fix in a 0.9 release?

          Daniel Dai added a comment -

          Solving it on the Pig side seems cleaner. Can you try the patch on PIG-2666?

          Francis Liu made changes -
          Link This issue is related to PIG-2666 [ PIG-2666 ]
          Francis Liu made changes -
          Assignee Francis Liu [ toffer ]
          David Capwell made changes -
          Field Original Value New Value
          Affects Version/s 0.5 [ 12320147 ]
          David Capwell created issue -

            People

            • Assignee:
              Francis Liu
              Reporter:
              David Capwell
            • Votes:
              0
              Watchers:
              1

              Dates

              • Created:
                Updated:
                Resolved:
