Hive / HIVE-17085

ORC file merge/concatenation should do full schema check

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0, 2.3.0, 3.0.0
    • Fix Version/s: 3.0.0, 2.4.0
    • Component/s: ORC
    • Labels:
      None

      Description

      The ORC merge/concatenation compatibility check only compares the column count at the outer level. ORC schema evolution now supports inner structs as well, so the outer-level column count can match while the inner columns do not. The compatibility check should do a full schema match before merging/concatenation. This issue does not cause data loss, but it does cause task failures with an exception like the one below:

      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close OrcFileMergeOperator
      	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
      	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
      	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
      	at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:212)
      	... 16 more
      Caused by: java.lang.IllegalArgumentException: Column has wrong number of index entries found: 0 expected: 1
      	at org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
      	at org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
      	at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
      	at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
      	at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
      	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:243)
      	... 19 more
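
      To illustrate the gap described above, here is a minimal sketch of an outer-level column count check next to a recursive full schema check. The `Type` class and both method names are hypothetical stand-ins written for this example; they are not the ORC or Hive API, and the sketch ignores field names for brevity.

      ```java
      import java.util.Arrays;
      import java.util.Collections;
      import java.util.List;

      // Hypothetical, simplified model of a schema tree; NOT the real ORC API.
      final class SchemaCheckSketch {

          /** A type node: a category such as "int" or "struct", plus child types. */
          static final class Type {
              final String category;
              final List<Type> children;

              Type(String category, List<Type> children) {
                  this.category = category;
                  this.children = children;
              }

              static Type primitive(String category) {
                  return new Type(category, Collections.<Type>emptyList());
              }

              static Type struct(Type... fields) {
                  return new Type("struct", Arrays.asList(fields));
              }
          }

          /** Outer-level check only: compares the number of top-level columns. */
          static boolean outerCountMatches(Type a, Type b) {
              return a.children.size() == b.children.size();
          }

          /** Full check: category and child count must match at every nesting level. */
          static boolean fullSchemaMatches(Type a, Type b) {
              if (!a.category.equals(b.category) || a.children.size() != b.children.size()) {
                  return false;
              }
              for (int i = 0; i < a.children.size(); i++) {
                  if (!fullSchemaMatches(a.children.get(i), b.children.get(i))) {
                      return false;
                  }
              }
              return true;
          }

          public static void main(String[] args) {
              // struct<int, struct<int>>: schema before an inner column was added
              Type before = Type.struct(Type.primitive("int"),
                      Type.struct(Type.primitive("int")));
              // struct<int, struct<int, string>>: inner struct gained a column
              Type after = Type.struct(Type.primitive("int"),
                      Type.struct(Type.primitive("int"), Type.primitive("string")));

              System.out.println(outerCountMatches(before, after)); // true
              System.out.println(fullSchemaMatches(before, after)); // false
          }
      }
      ```

      Both files have two outer columns, so the outer-level check passes, while the recursive check rejects the pair because the inner struct differs.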
      

      Concatenation should also verify that the writer version matches (it currently checks only that the file version matches).
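
      As a rough sketch of that additional check, the example below compares both the file version and the writer version of two files. The `FileInfo` class, the method names, and the version strings are assumptions made up for illustration; the actual fields and checks in the merge operator may differ.

      ```java
      // Hypothetical per-file metadata holder; NOT the real Hive/ORC classes.
      final class MergeCompatibilitySketch {

          static final class FileInfo {
              final String fileVersion;   // illustrative value, e.g. "0.12"
              final String writerVersion; // illustrative value, e.g. "HIVE_8732"

              FileInfo(String fileVersion, String writerVersion) {
                  this.fileVersion = fileVersion;
                  this.writerVersion = writerVersion;
              }
          }

          /** The behavior the issue describes: only the file version is compared. */
          static boolean fileVersionMatches(FileInfo a, FileInfo b) {
              return a.fileVersion.equals(b.fileVersion);
          }

          /** The proposed behavior: file version and writer version must both match. */
          static boolean fileAndWriterVersionMatch(FileInfo a, FileInfo b) {
              return fileVersionMatches(a, b) && a.writerVersion.equals(b.writerVersion);
          }

          public static void main(String[] args) {
              FileInfo first = new FileInfo("0.12", "HIVE_8732");
              FileInfo second = new FileInfo("0.12", "ORC_135");

              System.out.println(fileVersionMatches(first, second));        // true
              System.out.println(fileAndWriterVersionMatch(first, second)); // false
          }
      }
      ```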

        Attachments

        1. HIVE-17085.1.patch
          24 kB
          Prasanth Jayachandran
        2. HIVE-17085.2.patch
          38 kB
          Prasanth Jayachandran

              People

              • Assignee:
                prasanth_j Prasanth Jayachandran
                Reporter:
                prasanth_j Prasanth Jayachandran
              • Votes:
                0
                Watchers:
                4
