Description
The following sequence reproduces the problem:
- Create a managed table using path options
- Drop the table via dropping the parent database cascade
- Re-create the database and table with a different path
- The new table shows data from the old path, not the new path
echo "first" > /tmp/first.csv
echo "second" > /tmp/second.csv
spark-shell

spark.version
res0: String = 2.3.0

spark.sql("create database foo")
spark.sql("create table foo.first (id string) using csv options (path='/tmp/first.csv')")
spark.table("foo.first").show()
+-----+
|   id|
+-----+
|first|
+-----+

spark.sql("drop database foo cascade")
spark.sql("create database foo")
spark.sql("create table foo.first (id string) using csv options (path='/tmp/second.csv')")

"note, the path is different now, pointing to second.csv, but still showing data from first file"
spark.table("foo.first").show()
+-----+
|   id|
+-----+
|first|
+-----+

"now, if I drop the table explicitly, instead of via dropping database cascade, then it will be the correct result"
spark.sql("drop table foo.first")
spark.sql("create table foo.first (id string) using csv options (path='/tmp/second.csv')")
spark.table("foo.first").show()
+------+
|    id|
+------+
|second|
+------+
The same sequence produces the same incorrect result in 2.3.1.
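Since dropping the table explicitly gives the correct result in the reproduction above, one possible workaround is to drop every table in the database before dropping the database itself. The following is only a sketch under that assumption, not a fix for the underlying issue. `dropDatabaseExplicitly` is a hypothetical helper name, not part of Spark, and it is written to be pasted into spark-shell.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper (not a Spark API): drop each table or view in `db`
// explicitly, then drop the database itself.
def dropDatabaseExplicitly(spark: SparkSession, db: String): Unit = {
  // listTables(db) also returns session-wide temporary views; skip those,
  // since they do not belong to the database being dropped.
  val entries = spark.catalog.listTables(db).collect().filterNot(_.isTemporary)

  entries.foreach { t =>
    // Permanent views must be removed with DROP VIEW rather than DROP TABLE.
    val kind = if (t.tableType == "VIEW") "view" else "table"
    spark.sql(s"drop $kind if exists `${db}`.`${t.name}`")
  }

  // The database should be empty by now; cascade is kept as a safety net.
  spark.sql(s"drop database if exists `${db}` cascade")
}
```

In the sequence above, calling `dropDatabaseExplicitly(spark, "foo")` would take the place of `spark.sql("drop database foo cascade")`. Whether this avoids the stale result in every case has not been verified here.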