Description
There are existing flags "spark.sql.files.ignoreCorruptFiles" and "spark.sql.files.ignoreMissingFiles" that quietly skip reads of corrupted or missing files, but a query can still fail on corrupt records in sequence files.
Being able to ignore corrupt records is useful when users want queries to succeed over dirty data (e.g. mixed schemas in one table).
We would like to add a "spark.sql.hive.ignoreCorruptRecord" flag to fill this gap in the functionality.
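As a sketch, the existing file-level flags and the proposed record-level flag would be set like this (the exact name and semantics of "spark.sql.hive.ignoreCorruptRecord" are as proposed in this issue, not yet part of Spark):

```sql
-- Existing flags: silently skip files that are corrupt or missing
SET spark.sql.files.ignoreCorruptFiles=true;
SET spark.sql.files.ignoreMissingFiles=true;

-- Proposed flag (this issue): additionally skip individual corrupt
-- records when reading Hive tables backed by sequence files
SET spark.sql.hive.ignoreCorruptRecord=true;
```

With all three enabled, a scan over a dirty table would drop unreadable files and unreadable records instead of failing the query.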