Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-13325

Excessive logging when ORC PPD fails type conversions

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 1.3.0, 2.1.0
    • 1.3.0, 2.1.0
    • Logging, ORC
    • None

    Description

      Timestamp was specified as "YYYY-MM-DD HH:MM:SS": 2016-01-23 00:00:00

      2016-02-10 02:15:43,175 [WARN] [TezChild] |orc.RecordReaderImpl|: Exception when evaluating predicate. Skipping ORC PPD. Exception: java.lang.IllegalArgumentException: ORC SARGS could not convert from String to TIMESTAMP
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.getBaseObjectForComparison(RecordReaderImpl.java:659)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicateRange(RecordReaderImpl.java:373)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicateProto(RecordReaderImpl.java:338)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$SargApplier.pickRowGroups(RecordReaderImpl.java:710)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:751)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:777)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:986)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1019)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:205)
              at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:598)
              at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:183)
              at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.<init>(OrcRawRecordMerger.java:226)
              at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:437)
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1269)
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1151)
              at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249)
              at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:193)
              at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:135)
              at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:101)
              at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149)
              at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80)
              at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:650)
              at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:621)
              at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145)
              at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
              at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:406)
              at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:128)
              at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
              at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
              at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
              at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
              at java.util.concurrent.FutureTask.run(FutureTask.java:262)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      

      Timestamp specified as unix_ts (seconds): 1453507200

      2016-02-10 02:20:06,588 [WARN] [TezChild] |orc.RecordReaderImpl|: Exception when evaluating predicate. Skipping ORC PPD. Exception: java.lang.IllegalArgumentException: ORC SARGS could not convert from Integer to TIMESTAMP
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.getBaseObjectForComparison(RecordReaderImpl.java:659)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicateRange(RecordReaderImpl.java:373)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicateProto(RecordReaderImpl.java:338)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$SargApplier.pickRowGroups(RecordReaderImpl.java:710)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:751)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:777)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:986)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1019)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:205)
              at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:598)
              at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:183)
              at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.<init>(OrcRawRecordMerger.java:226)
              at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:437)
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1269)
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1151)
              at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249)
              at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:193)
              at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:135)
              at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:101)
              at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149)
              at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80)
              at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:650)
              at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:621)
              at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145)
              at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
              at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:406)
              at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:128)
              at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
              at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
              at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
              at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
              at java.util.concurrent.FutureTask.run(FutureTask.java:262)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      

      These messages are logged for each and every row group processed in tasks.

      Attachments

        1. HIVE-13325.2.patch
          2 kB
          Prasanth Jayachandran
        2. HIVE-13325.1.patch
          2 kB
          Prasanth Jayachandran

        Activity

          People

            prasanth_j Prasanth Jayachandran
            sseth Siddharth Seth
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: