Hive / HIVE-22178

Parquet FilterPredicate throws CastException after SchemaEvolution.

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: None
    • Labels: None

    Description

      Steps to reproduce:

      create table parq_test(age int, name string) stored as parquet;
      insert into parq_test values(1, 'aaaa');
      alter table parq_test change age age string;
      insert into parq_test values('b', 'bbbb');
      select * from parq_test where age='b';

      After the column's data type is changed, the final select fails with the following exception:

      Caused by: java.lang.IllegalArgumentException: FilterPredicate column: age's declared type (org.apache.parquet.io.api.Binary) does not match the schema found in file metadata. Column age is of type: INT32
      Valid types for this column are: [class java.lang.Integer]
       at org.apache.parquet.filter2.predicate.ValidTypeMap.assertTypeValid(ValidTypeMap.java:126)
       at org.apache.parquet.filter2.predicate.SchemaCompatibilityValidator.validateColumn(SchemaCompatibilityValidator.java:181)
       at org.apache.parquet.filter2.predicate.SchemaCompatibilityValidator.validateColumnFilterPredicate(SchemaCompatibilityValidator.java:151)
       at org.apache.parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:85)
       at org.apache.parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:58)
       at org.apache.parquet.filter2.predicate.Operators$Eq.accept(Operators.java:181)
       at org.apache.parquet.filter2.predicate.SchemaCompatibilityValidator.validate(SchemaCompatibilityValidator.java:63)
       at org.apache.parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:92)
       at org.apache.parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:43)
       at org.apache.parquet.filter2.compat.FilterCompat$FilterPredicateCompat.accept(FilterCompat.java:137)
       at org.apache.parquet.filter2.compat.RowGroupFilter.filterRowGroups(RowGroupFilter.java:64)
       at org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.getSplit(ParquetRecordReaderBase.java:111)
       at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.<init>(VectorizedParquetRecordReader.java:147)
       ... 31 more
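
      The message indicates that the predicate for age was built as Binary (the table's current string type), while the file written before the alter still records age as INT32. The sketch below is a simplified model of that mismatch and of one possible guard: skip the pushed-down predicate for files whose column type no longer matches. All names here (PredicateTypeGuard, EqPredicate, pushdownFor) are hypothetical, String stands in for org.apache.parquet.io.api.Binary, and this is not the attached patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model of the check that fails in the trace above.
// These are NOT Parquet or Hive classes.
final class PredicateTypeGuard {

    // Subset of the physical types a file footer can record for a column.
    enum FileType { INT32, BINARY }

    // Equality predicate as derived from the current table schema.
    static final class EqPredicate {
        final String column;
        final Object value;

        EqPredicate(String column, Object value) {
            this.column = column;
            this.value = value;
        }
    }

    // Value class a predicate must carry for each physical type.
    // String is an assumed stand-in for org.apache.parquet.io.api.Binary.
    static Class<?> validClassFor(FileType type) {
        return type == FileType.INT32 ? Integer.class : String.class;
    }

    // Mirrors the failing validation: rejects a predicate whose value type
    // disagrees with the type recorded in the file.
    static void assertTypeValid(EqPredicate p, Map<String, FileType> fileSchema) {
        FileType fileType = fileSchema.get(p.column);
        if (fileType == null) {
            throw new IllegalArgumentException("Column " + p.column + " not found in file schema");
        }
        if (!validClassFor(fileType).isInstance(p.value)) {
            throw new IllegalArgumentException(
                "Predicate on " + p.column + " has type " + p.value.getClass().getName()
                    + " but the file stores " + fileType);
        }
    }

    // Guard: keep the predicate for row-group filtering only when it is valid
    // for this file. An empty result means "read every row group and let the
    // query engine apply the filter after type conversion".
    static Optional<EqPredicate> pushdownFor(EqPredicate p, Map<String, FileType> fileSchema) {
        FileType fileType = fileSchema.get(p.column);
        boolean compatible = fileType != null && validClassFor(fileType).isInstance(p.value);
        return compatible ? Optional.of(p) : Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, FileType> oldFile = new HashMap<>();
        oldFile.put("age", FileType.INT32);   // written before the alter
        Map<String, FileType> newFile = new HashMap<>();
        newFile.put("age", FileType.BINARY);  // written after the alter

        EqPredicate p = new EqPredicate("age", "b"); // where age='b'

        System.out.println(pushdownFor(p, oldFile).isPresent()); // false
        System.out.println(pushdownFor(p, newFile).isPresent()); // true
    }
}
```

      In this model the old file is scanned without row-group filtering and the new file keeps the pushed-down predicate, so the query no longer fails on files that predate the type change.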

      Attachments

        1. HIVE-22178.3.patch
          20 kB
          Naresh P R
        2. HIVE-22178.2.patch
          20 kB
          Naresh P R
        3. HIVE-22178.1.patch
          18 kB
          Naresh P R

          People

            Assignee: Naresh P R (nareshpr)
            Reporter: Naresh P R (nareshpr)
            Votes: 0
            Watchers: 2
