  1. Apache Hudi
  2. HUDI-5971

Full schema evolution: ALTER COLUMN TYPE does not support a nested column, but the type of a field inside the struct can be altered


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.12.2, 0.13.0
    • Fix Version/s: None
    • Component/s: spark-sql

    Description

      Steps to reproduce:

      • Create the test table:
      CREATE TABLE default.schema_evolution_test (
                                id bigint,
                                test_array array<int>,
                                test_array_struct ARRAY<STRUCT<name:STRING, age:INT>>)
                              USING hudi
                              TBLPROPERTIES (
                                type = 'cow',
                                primaryKey = 'id')
                              LOCATION 's3://trino-ci-test/schema_evolution_test' 
      • Enable full schema evolution:
      SET hoodie.schema.on.read.enable=true
      • Alter the type of the nested column as a whole; this throws an exception:
      SQL A: alter table default.schema_evolution_test alter column test_array_struct type ARRAY<STRUCT<name:STRING, age:FLOAT>>;
      Exception: Failed in [alter table default.schema_evolution_test alter column test_array_struct type ARRAY<STRUCT<name:STRING, age:FLOAT>>]
      java.lang.IllegalArgumentException: only support update primitive type but find nest column: test_array_struct
              at org.apache.hudi.internal.schema.action.TableChanges$ColumnUpdateChange.updateColumnType(TableChanges.java:89)
              at org.apache.spark.sql.hudi.command.AlterTableCommand.$anonfun$applyUpdateAction$1(AlterTableCommand.scala:149)
        
      • Alter the type of the field inside the struct; this executes successfully:
      SQL B: alter table default.schema_evolution_test alter column test_array_struct.element.age type FLOAT;
      • Describe the table schema:
      desc default.schema_evolution_test;                              
      id                      bigint                                                            
      test_array              array<int>                                  
      test_array_struct       array<struct<name:string,age:float>>    
      • Questions:
        1. Is this a bug in Hudi's ALTER TABLE column type change?
        2. Both SQL A and SQL B change the type from array<struct<name:string,age:int>> to array<struct<name:string,age:float>>. Why is SQL A not supported?
        3. Does SQL B work as expected? Will SQL A be supported in the next release or a later one?
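
To make question 2 concrete, here is a minimal sketch of the distinction the exception message seems to describe. It assumes the check is applied to whatever type the given column path resolves to: `test_array_struct` resolves to a nested array type, while `test_array_struct.element.age` resolves to a primitive. The `SCHEMA` dict and the `resolve` and `check_type_change` helpers are hypothetical names invented for illustration; this is not Hudi's implementation in `TableChanges.ColumnUpdateChange.updateColumnType`.

```python
# Hypothetical model of the test table's schema (NOT Hudi's internal schema API).
# A primitive type is a plain string; a nested type is a (kind, inner) tuple.
SCHEMA = {
    "id": "bigint",
    "test_array": ("array", "int"),
    "test_array_struct": ("array", ("struct", {"name": "string", "age": "int"})),
}


def resolve(schema, path):
    """Walk a dotted column path and return the type it points at."""
    first, *rest = path.split(".")
    node = schema[first]
    for part in rest:
        if isinstance(node, str):
            raise KeyError(f"{path}: cannot descend into primitive type {node}")
        kind, inner = node
        if kind == "array" and part == "element":
            node = inner  # 'element' names the array's element type
        elif kind == "struct" and part in inner:
            node = inner[part]  # a named field of the struct
        else:
            raise KeyError(f"{path}: no field {part!r} in {kind}")
    return node


def check_type_change(schema, path):
    """Assumed rule: only a path that resolves to a primitive may be retyped."""
    target = resolve(schema, path)
    if not isinstance(target, str):
        raise ValueError(f"only primitive types can be updated, but {path} is nested")
    return target


# SQL B's path resolves to the primitive 'int', so the assumed check passes.
print(check_type_change(SCHEMA, "test_array_struct.element.age"))

# SQL A's path resolves to the whole array<struct<...>>, so the assumed check rejects it.
try:
    check_type_change(SCHEMA, "test_array_struct")
except ValueError as err:
    print(err)
```

Under this assumption, SQL A and SQL B ask for the same final schema but name different targets, which would explain why only SQL B passes the check. Whether SQL A should also be accepted remains the open question.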

          People

            Assignee: Unassigned
            Reporter: Alberic Liu