Details
Description
The ORC merge/concatenation compatibility check only verifies that the column count matches at the outer level. ORC schema evolution now supports inner structs as well, so two files can have the same outer-level column count while their inner columns differ. The compatibility check should do a full schema match before merging/concatenating. This issue does not cause data loss, but it does cause task failures with an exception like the one below:
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close OrcFileMergeOperator
    at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
    at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
    at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
    at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:212)
    ... 16 more
Caused by: java.lang.IllegalArgumentException: Column has wrong number of index entries found: 0 expected: 1
    at org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
    at org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
    at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
    at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
    at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:243)
    ... 19 more
Concatenation should also make sure the writer version matches (it currently checks only that the file version matches).
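To illustrate the difference between an outer-level column count check and a full schema match, here is a minimal sketch. It does not use the real ORC or Hive classes: `TypeNode`, `OrcFileInfo`, and `MergeCompatibility` are hypothetical stand-ins, and the version strings are placeholders. Whether field names should be part of the comparison is also an assumption made here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an ORC type tree; NOT the real org.apache.orc API.
final class TypeNode {
    final String category;          // e.g. "struct", "int", "string"
    final List<String> fieldNames;  // empty for non-struct types
    final List<TypeNode> children;  // empty for primitive types

    private TypeNode(String category, List<String> fieldNames, List<TypeNode> children) {
        this.category = category;
        this.fieldNames = fieldNames;
        this.children = children;
    }

    static TypeNode primitive(String category) {
        return new TypeNode(category,
                Collections.<String>emptyList(), Collections.<TypeNode>emptyList());
    }

    static TypeNode struct(List<String> fieldNames, List<TypeNode> children) {
        return new TypeNode("struct", fieldNames, children);
    }
}

// Hypothetical summary of the per-file metadata that matters for merging.
final class OrcFileInfo {
    final TypeNode schema;
    final String fileVersion;
    final String writerVersion;

    OrcFileInfo(TypeNode schema, String fileVersion, String writerVersion) {
        this.schema = schema;
        this.fileVersion = fileVersion;
        this.writerVersion = writerVersion;
    }
}

final class MergeCompatibility {
    // Outer-level check only, as the description says the current check behaves.
    static boolean outerColumnCountMatches(TypeNode a, TypeNode b) {
        return a.children.size() == b.children.size();
    }

    // Full recursive match: category, field names, and every nested child.
    static boolean sameSchema(TypeNode a, TypeNode b) {
        if (!a.category.equals(b.category)) return false;
        if (!a.fieldNames.equals(b.fieldNames)) return false;
        if (a.children.size() != b.children.size()) return false;
        for (int i = 0; i < a.children.size(); i++) {
            if (!sameSchema(a.children.get(i), b.children.get(i))) return false;
        }
        return true;
    }

    // Files are mergeable only if file version, writer version, and the
    // full schema all match.
    static boolean canMerge(OrcFileInfo a, OrcFileInfo b) {
        return a.fileVersion.equals(b.fileVersion)
                && a.writerVersion.equals(b.writerVersion)
                && sameSchema(a.schema, b.schema);
    }
}
```

With two schemas such as `struct<a:int,s:struct<x:int>>` and `struct<a:int,s:struct<x:int,y:string>>`, `outerColumnCountMatches` returns true because both have two top-level columns, while `sameSchema` returns false because the inner struct differs. That is the case in which the files should be skipped rather than merged.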
Attachments