Spark / SPARK-29295

Duplicate result when dropping partition of an external table and then overwriting

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.4
    • Fix Version/s: 3.0.0
    • Component/s: SQL
    • Labels: None

      Description

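      When we drop a partition of an external table and then overwrite it with CONVERT_METASTORE_PARQUET=true (the default value), the insert replaces the contents of the partition as expected.
      With CONVERT_METASTORE_PARQUET=false, however, the query returns duplicate results: the rows written before the drop come back alongside the new ones. Dropping a partition of an external table removes only its metadata, so the old data files are still in the partition directory.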

      The following test reproduces the issue (it can be added to SQLQuerySuite in the hive module):

        test("spark gives duplicate result when dropping a partition of an external partitioned table" +
          " first and then overwriting it") {
          withTable("test") {
            withTempDir { f =>
              sql("create external table test(id int) partitioned by (name string) stored as " +
                s"parquet location '${f.getAbsolutePath}'")

              // Conversion disabled (Hive SerDe write path): the row written before the drop
              // is returned again. This assertion documents the buggy behavior.
              withSQLConf(HiveUtils.CONVERT_METASTORE_PARQUET.key -> false.toString) {
                sql("insert overwrite table test partition(name='n1') select 1")
                sql("ALTER TABLE test DROP PARTITION(name='n1')")
                sql("insert overwrite table test partition(name='n1') select 2")
                checkAnswer(sql("select id from test where name = 'n1' order by id"),
                  Seq(Row(1), Row(2)))
              }

              // Conversion enabled (the default): only the newly written row is returned.
              withSQLConf(HiveUtils.CONVERT_METASTORE_PARQUET.key -> true.toString) {
                sql("insert overwrite table test partition(name='n1') select 1")
                sql("ALTER TABLE test DROP PARTITION(name='n1')")
                sql("insert overwrite table test partition(name='n1') select 2")
                checkAnswer(sql("select id from test where name = 'n1' order by id"),
                  Seq(Row(2)))
              }
            }
          }
        }
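
The report does not say why the two settings differ. The sketch below is a toy model in plain Scala (no Spark or Hive involved) of one plausible explanation: dropping a partition of an external table leaves its files on disk, and the non-converted write path only clears a directory whose partition is still registered. That second point is an assumption, and every name here (ToyExternalTable, clearsUnregisteredDir, ToyExternalTableDemo) is hypothetical and not part of any Spark or Hive API.

```scala
import scala.collection.mutable

// Toy model only: `files` stands in for partition directories on disk and
// `catalog` for the partitions registered in the metastore.
final class ToyExternalTable(clearsUnregisteredDir: Boolean) {
  private val files = mutable.Map.empty[String, Vector[Int]]
  private val catalog = mutable.Set.empty[String]

  // External table: dropping a partition removes metadata, not data files.
  def dropPartition(part: String): Unit = {
    catalog -= part
  }

  // INSERT OVERWRITE: old files are discarded when the partition is registered,
  // or unconditionally when clearsUnregisteredDir is true (assumed behavior).
  def insertOverwrite(part: String, rows: Vector[Int]): Unit = {
    val keepOld = !catalog.contains(part) && !clearsUnregisteredDir
    val old = if (keepOld) files.getOrElse(part, Vector.empty[Int]) else Vector.empty[Int]
    files(part) = old ++ rows
    catalog += part
  }

  // SELECT ... ORDER BY id over a single partition.
  def select(part: String): Vector[Int] =
    if (catalog.contains(part)) files.getOrElse(part, Vector.empty[Int]).sorted
    else Vector.empty[Int]
}

object ToyExternalTableDemo {
  // Same statement sequence as the reproduction: insert 1, drop, insert 2.
  def run(clearsUnregisteredDir: Boolean): Vector[Int] = {
    val t = new ToyExternalTable(clearsUnregisteredDir)
    t.insertOverwrite("n1", Vector(1))
    t.dropPartition("n1")
    t.insertOverwrite("n1", Vector(2))
    t.select("n1")
  }

  def main(args: Array[String]): Unit = {
    println(run(clearsUnregisteredDir = false))
    println(run(clearsUnregisteredDir = true))
  }
}
```

Under this model, run(clearsUnregisteredDir = false) mirrors the CONVERT_METASTORE_PARQUET=false case and returns both rows, while run(clearsUnregisteredDir = true) mirrors the default case and returns only the new row.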
      

              People

              • Assignee: viirya L. C. Hsieh
              • Reporter: hzfeiwang feiwang
              • Votes: 1
              • Watchers: 5