Description
Selecting the hidden file metadata column `_metadata` from a DataFrame and then writing the result to a file format such as `parquet` or `delta` preserves the internal `Attribute` metadata. When those `parquet` or `delta` files are read back, the query breaks, because the user data column named `_metadata` is wrongly treated as the hidden file source metadata column.
Reproducible code:
// prepare a file source df
df.select("*", "_metadata")
  .write.format("parquet").save(path)

spark.read.format("parquet").load(path)
  .select("*").show()