Spark / SPARK-39484

V2 write for type struct fails to handle case sensitivity on field names during resolution of V2 write command

Details

    • Type: Bug
    • Status: In Progress
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.1.1, 3.2.1
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None
    • master, 3.1.1

    Description

      Summary:

      When a V2 write takes an input with a struct type whose field names differ from the target table's only in casing, the caseSensitive config (spark.sql.caseSensitive) is not honored: the field names are always compared case-sensitively during resolution of the V2 write command.

      Repro:

      CREATE TABLE tmp.test_table_to (key int, object struct<shardId:int>) USING ICEBERG;
      CREATE TABLE tmp.test_table_from (key int, object struct<shardid:int>) USING HIVE;
      INSERT OVERWRITE tmp.test_table_to SELECT 1 as key, object FROM tmp.test_table_from;

      The above results in the following exception:

      Error in query: unresolved operator 'OverwriteByExpression RelationV2[key#3, object#4] spark_catalog.tmp.test_table_to, true, false;
      'OverwriteByExpression RelationV2[key#3, object#4] spark_catalog.tmp.test_table_to, true, false
      +- Project [1 AS key#0, object#2]
         +- SubqueryAlias spark_catalog.tmp.test_table_from
            +- HiveTableRelation [`tmp`.`test_table_from`, org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe, Data Cols: [key#1, object#2], Partition Cols: []]

      If the struct field names match in casing, the V2 write works as expected.
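
      The difference between the two comparison modes can be illustrated with a minimal Python sketch. This is not Spark's implementation; make_resolver and match_struct_fields are hypothetical helper names introduced only for illustration, and the sketch assumes that a case-insensitive comparison simply lowercases both names. It shows why the struct field shardid fails to line up with shardId under a strict comparison but would resolve if the caseSensitive setting were honored.

```python
from typing import Callable, Dict, List, Optional


def make_resolver(case_sensitive: bool) -> Callable[[str, str], bool]:
    """Return a name-equality function (hypothetical helper, not a Spark API)."""
    if case_sensitive:
        return lambda a, b: a == b
    # Assumption: case-insensitive matching compares lowercased names.
    return lambda a, b: a.lower() == b.lower()


def match_struct_fields(
    source_fields: List[str],
    target_fields: List[str],
    case_sensitive: bool,
) -> Dict[str, Optional[str]]:
    """Map each target field name to the matching source field name, or None."""
    same_name = make_resolver(case_sensitive)
    return {
        target: next((s for s in source_fields if same_name(s, target)), None)
        for target in target_fields
    }


# Field names taken from the repro: struct<shardid:int> written into struct<shardId:int>.
source = ["shardid"]
target = ["shardId"]

strict = match_struct_fields(source, target, case_sensitive=True)
relaxed = match_struct_fields(source, target, case_sensitive=False)

print(strict)   # {'shardId': None}       -> no match, analogous to the unresolved operator
print(relaxed)  # {'shardId': 'shardid'}  -> match when casing is ignored
```

      The reported behavior corresponds to the strict branch being taken for struct fields regardless of the configured value.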

            People

              Assignee: Unassigned
              Reporter: Edgar Rodriguez (erod)
              Votes: 0
              Watchers: 2