Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version: 3.1.1
- Fix Version: None
- Component: None
Description
When you cancel a running Apache Spark write operation and then attempt to rerun it, the following error occurs:

Error: org.apache.spark.sql.AnalysisException: Cannot create the managed table('`testdb`.`testtable`'). The associated location ('dbfs:/user/hive/warehouse/testdb.db/metastore_cache_testtable') already exists.;
This problem can occur if:
- The cluster is terminated while a write operation is in progress.
- A temporary network issue occurs.
- The job is interrupted.
You can reproduce the problem by following these steps:
1. Create a DataFrame:
val df = spark.range(1000)
2. Write the DataFrame to a table in overwrite mode:
import org.apache.spark.sql.SaveMode
df.write.mode(SaveMode.Overwrite).saveAsTable("testdb.testtable")
3. Cancel the command while it is executing.
4. Re-run the write command.
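Taken together, the steps above can be sketched as a single snippet, to be run interactively in a Spark shell or notebook where a `spark` session is already available (the cancellation in step 3 is a manual action and cannot be scripted; the snippet assumes the `testdb` database exists):

```scala
import org.apache.spark.sql.SaveMode

// 1. Create a DataFrame of 1000 rows.
val df = spark.range(1000)

// 2. Start writing it as a managed table; Spark creates the table's
//    directory under the warehouse location as soon as the write begins.
df.write.mode(SaveMode.Overwrite).saveAsTable("testdb.testtable")

// 3. Cancel the command manually while it is still executing.
//    This leaves a partially written directory behind at the table's location.

// 4. Re-running the same write now fails with the AnalysisException above,
//    because the managed table's location already exists on disk.
df.write.mode(SaveMode.Overwrite).saveAsTable("testdb.testtable")
```

Because the failure depends on interrupting step 2 mid-write, timing matters: cancelling after the write completes will not reproduce the error.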