  1. Spark
  2. SPARK-36562 Improve InsertIntoHadoopFsRelation file commit logic
  3. SPARK-32838

Cannot insert overwrite a different partition of the same table


Details

    • Type: Sub-task
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None
    • Environment: Hadoop 2.7 + Spark 3.0.0

    Description

      Running the following statements:

      CREATE TABLE tmp.spark3_snap (
      id string
      )
      PARTITIONED BY (dt string)
      STORED AS ORC
      ;
      
      insert overwrite table tmp.spark3_snap partition(dt='2020-09-09')
      select 10;
      insert overwrite table tmp.spark3_snap partition(dt='2020-09-10')
      select 1;
      
      insert overwrite table tmp.spark3_snap partition(dt='2020-09-10')
      select id from tmp.spark3_snap where dt='2020-09-09';
      

      The last statement fails with the error: "Cannot overwrite a path that is also being read from"

      Related issue: https://issues.apache.org/jira/browse/SPARK-24194

      This works on Spark 2.4.3 but fails on Spark 3.0.0.
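
A possible workaround, sketched here rather than confirmed as a fix: copy the source partition into a separate staging table first, so that the final INSERT OVERWRITE no longer reads from the table it writes to. The staging table name tmp.spark3_snap_stage is hypothetical and not part of the original report. Separately, Spark 3.0 has a setting, spark.sql.hive.convertInsertingPartitionedTable, that may restore the 2.4 write path when set to false; whether it avoids this particular error is an assumption that has not been verified here.

```sql
-- Hypothetical staging table; the name is an assumption for illustration.
-- Step 1: materialize the rows of the source partition outside the target table.
CREATE TABLE tmp.spark3_snap_stage
STORED AS ORC
AS SELECT id FROM tmp.spark3_snap WHERE dt = '2020-09-09';

-- Step 2: overwrite the target partition, reading only from the staging table,
-- so the overwritten path is not among the paths being read.
INSERT OVERWRITE TABLE tmp.spark3_snap PARTITION (dt = '2020-09-10')
SELECT id FROM tmp.spark3_snap_stage;

-- Step 3: clean up the staging table.
DROP TABLE tmp.spark3_snap_stage;
```

The extra copy costs one additional write of the source partition, which is usually acceptable for snapshot-style tables like the one in this report.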
       

            People

              Assignee: Unassigned
              Reporter: CHENXCHEN CHC
              Votes: 1
              Watchers: 11

              Dates

                Created:
                Updated: