Hive / HIVE-17794

HCatLoader breaks when a member is added to a struct-column of a table

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0, 2.4.0, 3.0.0
    • Fix Version/s: None
    • Component/s: HCatalog
    • Labels: None
    • HADOOPPF-13737

    Description

      When a table's schema evolves to add a new member to a struct column, Hive queries work fine, but HCatLoader breaks with the following trace:

      TaskAttempt 1 failed, info=
       Error: Failure while running task:org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: kite_composites_with_segments: Local Rearrange[tuple]{chararray}(false) - scope-555-> scope-974 Operator Key: scope-555): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: gup: New For Each(false,false)[bag] - scope-548 Operator Key: scope-548): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: gup_filtered: Filter[bag] - scope-522 Operator Key: scope-522): org.apache.pig.backend.executionengine.ExecException: ERROR 0: org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:314)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:287)
      at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POLocalRearrangeTez.getNextTuple(POLocalRearrangeTez.java:127)
      at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:376)
      at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:241)
      at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:362)
      at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
      at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
      at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
      at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
      at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: gup: New For Each(false,false)[bag] - scope-548 Operator Key: scope-548): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: gup_filtered: Filter[bag] - scope-522 Operator Key: scope-522): org.apache.pig.backend.executionengine.ExecException: ERROR 0: org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:314)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:252)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305)
      ... 17 more
      Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: gup_filtered: Filter[bag] - scope-522 Operator Key: scope-522): org.apache.pig.backend.executionengine.ExecException: ERROR 0: org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:314)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNextTuple(POFilter.java:90)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305)
      ... 19 more
      Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple
      at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POSimpleTezLoad.getNextTuple(POSimpleTezLoad.java:160)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305)
      ... 21 more
      Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple
      at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
      at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:63)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
      at org.apache.tez.mapreduce.lib.MRReaderMapReduce.next(MRReaderMapReduce.java:118)
      at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POSimpleTezLoad.getNextTuple(POSimpleTezLoad.java:140)
      ... 22 more
      Caused by: java.lang.IndexOutOfBoundsException: Index: 31, Size: 31
      at java.util.ArrayList.rangeCheck(ArrayList.java:653)
      at java.util.ArrayList.get(ArrayList.java:429)
      at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:468)
      at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:451)
      at org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:410)
      at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:468)
      at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:386)
      at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
      ... 26 more
      

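      When filling in column values, HCatLoader should have supplied nulls for fields that do not exist in the data, such as a struct member added after the data was written. Instead, it indexes past the end of the value list (Index: 31, Size: 31 in PigHCatUtil.transformToTuple). A patch will be made available shortly.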
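
      The expected behaviour can be sketched as follows. This is only an illustration, not the HCatalog code or the attached patch: the class StructPadding and both method names are hypothetical, and the struct is modelled as a plain list of values alongside a schema field count. The trace suggests a lookup by schema position into a shorter value list, which the first method imitates; the second pads the missing trailing fields with null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not PigHCatUtil and not the attached patch. */
public class StructPadding {

    /** Naive conversion: assumes the record holds one value per schema field. */
    static List<Object> copyBySchema(List<Object> values, int schemaFieldCount) {
        List<Object> out = new ArrayList<>(schemaFieldCount);
        for (int i = 0; i < schemaFieldCount; i++) {
            // Throws IndexOutOfBoundsException when the record is shorter than the schema.
            out.add(values.get(i));
        }
        return out;
    }

    /** Tolerant conversion: fields missing from the record become null. */
    static List<Object> padToSchema(List<Object> values, int schemaFieldCount) {
        List<Object> out = new ArrayList<>(schemaFieldCount);
        for (int i = 0; i < schemaFieldCount; i++) {
            out.add(i < values.size() ? values.get(i) : null);
        }
        return out;
    }

    public static void main(String[] args) {
        // A record written before the struct gained its third member.
        List<Object> oldRecord = Arrays.<Object>asList("a", 1);

        System.out.println(padToSchema(oldRecord, 3));

        try {
            copyBySchema(oldRecord, 3);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("naive copy failed on the missing field");
        }
    }
}
```

      Padding by position assumes that new struct members are only ever appended at the end, so existing positions keep their meaning; a reordered or renamed member would need matching by field name instead.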

      Attachments

        1. HIVE-17794.1.patch
          1 kB
          Mithun Radhakrishnan
        2. HIVE-17794.02.patch
          27 kB
          Mithun Radhakrishnan
        3. HIVE-17794.03.patch
          28 kB
          Mithun Radhakrishnan

            People

              Assignee: Mithun Radhakrishnan
              Reporter: Mithun Radhakrishnan
              Votes: 0
              Watchers: 4