Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-19015

Vectorization and Parquet: When vectorized, parquet_map_of_arrays_of_ints.q gets a ClassCastException

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: Hive
    • Labels:
      None

      Description

      Adding "SET hive.vectorized.execution.enabled=true;" to parquet_map_of_arrays_of_ints.q triggers this call stack:

      Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo cannot be cast to org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo
      	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.readBatch(VectorizedListColumnReader.java:67) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedMapColumnReader.readBatch(VectorizedMapColumnReader.java:57) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:410) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      

      FYI: Vihang Karajgaonkar

      Adding parquet_map_of_maps.q, too. Stack trace seems related.

      Caused by: java.lang.ClassCastException: optional group value (MAP) {
        repeated group key_value {
          optional binary key (UTF8);
          required int32 value;
        }
      } is not primitive
      	at org.apache.parquet.schema.Type.asPrimitiveType(Type.java:213) ~[parquet-hadoop-bundle-1.9.0.jar:1.9.0]
      	at org.apache.hadoop.hive.ql.io.parquet.vector.BaseVectorizedColumnReader.<init>(BaseVectorizedColumnReader.java:130) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.<init>(VectorizedListColumnReader.java:52) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:568) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                jerrychenhf Haifeng Chen
                Reporter:
                mmccline Matt McCline
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated: