SPARK-36086: The case of the delta table is inconsistent with parquet
Parent task: SPARK-36352 Spark should check result plan's output schema name


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.1
    • Fix Version/s: 3.2.0, 3.1.3, 3.0.4
    • Component/s: SQL
    • Labels: None

    Description

      How to reproduce this issue:

      1. Add delta-core_2.12-1.0.0-SNAPSHOT.jar to ${SPARK_HOME}/jars.
      2. Start spark-shell with the Delta SQL extension and catalog configured: bin/spark-shell --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog
      
      spark.sql("create table t1 using parquet as select id, id as lower_id from range(5)")
      spark.sql("CREATE VIEW v1 as SELECT * FROM t1")
      spark.sql("CREATE TABLE t2 USING DELTA PARTITIONED BY (LOWER_ID) SELECT LOWER_ID, ID FROM v1")
      spark.sql("CREATE TABLE t3 USING PARQUET PARTITIONED BY (LOWER_ID) SELECT LOWER_ID, ID FROM v1")
      
      spark.sql("desc extended t2").show(false)
      spark.sql("desc extended t3").show(false)
      
      scala> spark.sql("desc extended t2").show(false)
      +----------------------------+--------------------------------------------------------------------------+-------+
      |col_name                    |data_type                                                                 |comment|
      +----------------------------+--------------------------------------------------------------------------+-------+
      |lower_id                    |bigint                                                                    |       |
      |id                          |bigint                                                                    |       |
      |                            |                                                                          |       |
      |# Partitioning              |                                                                          |       |
      |Part 0                      |lower_id                                                                  |       |
      |                            |                                                                          |       |
      |# Detailed Table Information|                                                                          |       |
      |Name                        |default.t2                                                                |       |
      |Location                    |file:/Users/yumwang/Downloads/spark-3.1.1-bin-hadoop2.7/spark-warehouse/t2|       |
      |Provider                    |delta                                                                     |       |
      |Table Properties            |[Type=MANAGED,delta.minReaderVersion=1,delta.minWriterVersion=2]          |       |
      +----------------------------+--------------------------------------------------------------------------+-------+
      
      
      scala> spark.sql("desc extended t3").show(false)
      +----------------------------+--------------------------------------------------------------------------+-------+
      |col_name                    |data_type                                                                 |comment|
      +----------------------------+--------------------------------------------------------------------------+-------+
      |ID                          |bigint                                                                    |null   |
      |LOWER_ID                    |bigint                                                                    |null   |
      |# Partition Information     |                                                                          |       |
      |# col_name                  |data_type                                                                 |comment|
      |LOWER_ID                    |bigint                                                                    |null   |
      |                            |                                                                          |       |
      |# Detailed Table Information|                                                                          |       |
      |Database                    |default                                                                   |       |
      |Table                       |t3                                                                        |       |
      |Owner                       |yumwang                                                                   |       |
      |Created Time                |Mon Jul 12 14:07:16 CST 2021                                              |       |
      |Last Access                 |UNKNOWN                                                                   |       |
      |Created By                  |Spark 3.1.1                                                               |       |
      |Type                        |MANAGED                                                                   |       |
      |Provider                    |PARQUET                                                                   |       |
      |Location                    |file:/Users/yumwang/Downloads/spark-3.1.1-bin-hadoop2.7/spark-warehouse/t3|       |
      |Serde Library               |org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe               |       |
      |InputFormat                 |org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat             |       |
      |OutputFormat                |org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat            |       |
      |Partition Provider          |Catalog                                                                   |       |
      +----------------------------+--------------------------------------------------------------------------+-------+
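
      The two DESC outputs above show the inconsistency: the Delta table t2 reports the lower-case column names of the source table (lower_id, id), while the Parquet table t3 keeps the upper-case names written in the query (LOWER_ID, ID). A minimal sketch, assuming the tables created above, to observe the same difference directly from the catalog schema (the values in the comments are taken from the DESC output above, not re-run):

      // Compare the column names Spark reports for the two CTAS results.
      println(spark.table("t2").schema.fieldNames.mkString(", "))   // Delta:   lower_id, id
      println(spark.table("t3").schema.fieldNames.mkString(", "))   // Parquet: LOWER_ID, ID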
      

    People

    • Assignee: angerszhuuu (angerszhu)
    • Reporter: yumwang (Yuming Wang)
    • Votes: 0
    • Watchers: 5
