[SPARK-24669] Managed table was not cleared of path after drop database cascade - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.3.0, 2.3.1
Fix Version/s: 2.3.4, 2.4.1, 3.0.0
Component/s: SQL
Labels: None

Description

The following sequence reproduces the problem:

1. Create a managed table using path options.
2. Drop the table by dropping the parent database with cascade.
3. Re-create the database and the table with a different path.

The new table then shows data from the old path, not the new path.

echo "first" > /tmp/first.csv
echo "second" > /tmp/second.csv
spark-shell
spark.version
res0: String = 2.3.0
spark.sql("create database foo")
spark.sql("create table foo.first (id string) using csv options (path='/tmp/first.csv')")
spark.table("foo.first").show()
+-----+
|   id|
+-----+
|first|
+-----+
spark.sql("drop database foo cascade")
spark.sql("create database foo")
spark.sql("create table foo.first (id string) using csv options (path='/tmp/second.csv')")
// Note: the path is different now, pointing to second.csv, but the table still shows data from the first file.
spark.table("foo.first").show()
+-----+
|   id|
+-----+
|first|
+-----+
// Now, if I drop the table explicitly instead of via drop database cascade, the result is correct.
spark.sql("drop table foo.first")
spark.sql("create table foo.first (id string) using csv options (path='/tmp/second.csv')")
spark.table("foo.first").show()
+------+
|    id|
+------+
|second|
+------+

The same sequence fails in 2.3.1 as well.
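
On affected versions, the observation above (an explicit drop table gives the correct result) suggests a possible workaround: drop each table explicitly before dropping the database. The following is a minimal sketch under that assumption, not the fix from the linked pull request. The helper name dropDatabaseExplicitly is hypothetical and not part of Spark; it only uses the public spark.catalog and spark.sql APIs.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper (not part of Spark): drops every table and view in `db`
// one by one, then drops the database itself. Assumption: an explicit
// DROP TABLE avoids the stale result, as observed in the report above.
def dropDatabaseExplicitly(spark: SparkSession, db: String): Unit = {
  if (spark.catalog.databaseExists(db)) {
    spark.catalog.listTables(db).collect().foreach { t =>
      if (t.isTemporary) {
        // Temporary views are session-scoped and do not belong to the database; skip them.
      } else if (t.tableType == "VIEW") {
        spark.sql(s"DROP VIEW IF EXISTS `$db`.`${t.name}`")
      } else {
        spark.sql(s"DROP TABLE IF EXISTS `$db`.`${t.name}`")
      }
    }
    // CASCADE is kept so that functions and any remaining objects are removed too.
    spark.sql(s"DROP DATABASE IF EXISTS `$db` CASCADE")
  }
}
```

In the reproduction above, calling dropDatabaseExplicitly(spark, "foo") would take the place of spark.sql("drop database foo cascade"). Whether this avoids the stale data in every case has not been verified here; upgrading to one of the listed fix versions is the more reliable route.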

Issue Links

links to

GitHub Pull Request #23905

People

Assignee: Udbhav Agrawal
Reporter: Dong Jiang
Votes: 0
Watchers: 5

Dates

Created: 27/Jun/18 19:55
Updated: 23/Mar/19 17:43
Resolved: 06/Mar/19 17:24