Spark / SPARK-24194

HadoopFsRelation cannot overwrite a path that is also being read from


Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None
    • Environment: spark master

    Description

      When

      INSERT OVERWRITE TABLE territory_count_compare SELECT * FROM territory_count_compare WHERE shop_count != real_shop_count

      is run and territory_count_compare is a Parquet table, it fails with:

      Cannot overwrite a path that is also being read from

      MetastoreDataSourceSuite.scala also contains a test case that exercises this pattern:

      table(tableName).write.mode(SaveMode.Overwrite).insertInto(tableName)

      However, when territory_count_compare is a plain Hive table, there is no error.

      So I think the cause is that an INSERT OVERWRITE into a HadoopFsRelation with a static partition deletes the target partition before the query runs. The deletion should instead happen when the job is committed.
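      To illustrate the difference between the two strategies, here is a minimal sketch using plain text files as a stand-in for Parquet partition files. All helper names are hypothetical and none of them are Spark APIs; this only models the filesystem behavior described above (clearing the target up front versus swapping it in at commit time).

      ```python
      import os
      import shutil
      import tempfile

      def write_rows(table_dir, rows):
          # One "part file" per row, mimicking partition output files.
          os.makedirs(table_dir, exist_ok=True)
          for i, row in enumerate(rows):
              with open(os.path.join(table_dir, f"part-{i}.txt"), "w") as f:
                  f.write(row)

      def read_rows(table_dir):
          if not os.path.isdir(table_dir):
              return []
          return [open(os.path.join(table_dir, name)).read()
                  for name in sorted(os.listdir(table_dir))]

      def overwrite_delete_first(table_dir, query):
          # Behavior described in this report: the target path is cleared
          # before the query runs, so a self-referencing INSERT OVERWRITE
          # finds nothing to read.
          shutil.rmtree(table_dir, ignore_errors=True)
          write_rows(table_dir, query())

      def overwrite_at_commit(table_dir, query):
          # Proposed behavior: run the query while the input is intact,
          # write to a staging directory, and swap it in only at "commit".
          result = list(query())
          staging = table_dir + ".staging"
          shutil.rmtree(staging, ignore_errors=True)
          write_rows(staging, result)
          shutil.rmtree(table_dir, ignore_errors=True)
          os.rename(staging, table_dir)

      base = tempfile.mkdtemp()
      t = os.path.join(base, "tbl")
      write_rows(t, ["a", "b"])
      # Self-read overwrite with delete-first loses the data:
      overwrite_delete_first(t, lambda: read_rows(t))
      print(read_rows(t))  # → []
      ```

      With overwrite_at_commit the same self-referencing query preserves the rows, because the input directory is only replaced after the query has finished.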


People

    Assignee: Unassigned
    Reporter: yangz
    Votes: 2
    Watchers: 6


Time Tracking

    Original Estimate: 24h
    Remaining Estimate: 24h
    Time Spent: Not Specified