Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 1.4.1, 1.6.2
- Fix Version/s: None
- Component/s: None
- Labels: Important
Description
According to the API documentation, the write mode overwrite should overwrite the existing data, which suggests that the data is removed, i.e. the table is truncated.
However, that is not what happens in the source code:
if (mode == SaveMode.Overwrite && tableExists) {
  JdbcUtils.dropTable(conn, table)
  tableExists = false
}
This clearly shows that the table is first dropped and then recreated. This causes two major issues:
- Existing indexes, partitioning schemes, etc. are completely lost.
- The case of identifiers (table and column names) may change without the user understanding why.
In my opinion, the table should be truncated, not dropped. Overwriting data is a DML operation and should not cause DDL.
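The first of these issues can be demonstrated outside of Spark. The sketch below is a minimal illustration using Python's standard-library sqlite3 module, not Spark's actual JDBC write path; the table and index names are made up. It shows that a truncate-style DELETE (DML) preserves an existing index, while drop-and-recreate (DDL) silently loses it:

```python
import sqlite3

# Stand-in database: sqlite3 is used only because it is self-contained;
# the same DROP-vs-DELETE distinction applies to any JDBC target.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.execute("CREATE INDEX idx_t_name ON t (name)")
conn.execute("INSERT INTO t VALUES (1, 'a')")

def index_names(conn):
    """Return the user-created indexes currently defined on table t."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 't'"
    ).fetchall()
    return [r[0] for r in rows]

# Truncate-style overwrite (DML): the rows go away, the index survives.
conn.execute("DELETE FROM t")
conn.execute("INSERT INTO t VALUES (2, 'b')")
after_truncate = index_names(conn)

# Drop-and-recreate overwrite (DDL): the index is silently lost.
conn.execute("DROP TABLE t")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.execute("INSERT INTO t VALUES (3, 'c')")
after_drop = index_names(conn)

print(after_truncate)  # ['idx_t_name']
print(after_drop)      # []
```

The same reasoning extends to partitioning schemes, grants, and any other DDL-level state attached to the table.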
Issue Links
- duplicates
  - SPARK-16463 Support `truncate` option in Overwrite mode for JDBC DataFrameWriter (Resolved)
- is duplicated by
  - SPARK-13699 Spark SQL drops the table in "overwrite" mode while writing into table (Resolved)