Description
In https://issues.apache.org/jira/browse/SPARK-39469, we introduced support for the date type in CSV schema inference. Schema inference on date/time columns currently behaves as follows:
- For a column containing only dates, we infer it as `DateType`
- For a column containing only timestamps, we infer it as `TimestampType`
- For a column containing a mixture of dates and timestamps, we infer it as `TimestampType`
However, we found that the last scenario was too ambitious: supporting it added considerable complexity to the code and raised a number of performance concerns. We therefore want to simplify and correct the behavior for that scenario as follows:
- For a column containing a mixture of dates and timestamps:
  - If the user specifies a timestamp format, the column is always inferred as `StringType`
  - If the user does not specify a timestamp format, we try to infer the column as `TimestampType` if possible; otherwise it is inferred as `StringType`
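
The proposed rules can be illustrated with a small, self-contained Python model. This is only a sketch of the decision logic described above, not Spark's implementation: the function `infer_column_type`, the helper names, the default date pattern, and the use of `datetime.fromisoformat` as a stand-in for lenient timestamp parsing are all assumptions made for illustration.

```python
from datetime import datetime

# Assumed default date pattern for this sketch (not taken from Spark).
DATE_FORMAT = "%Y-%m-%d"


def _matches(value, fmt):
    """Return True if `value` parses completely under the strptime pattern `fmt`."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def _lenient_timestamp(value):
    """Stand-in for lenient parsing: accepts ISO dates and ISO timestamps alike."""
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def infer_column_type(values, timestamp_format=None):
    """Toy model of the proposed inference rules for a non-empty column of strings."""
    # A column containing only dates is inferred as DateType.
    if all(_matches(v, DATE_FORMAT) for v in values):
        return "DateType"
    if timestamp_format is not None:
        # User-specified format: parse strictly, so date-only values do not match.
        parses = lambda v: _matches(v, timestamp_format)
    else:
        # No user format: try a lenient parse that also accepts date-only values.
        parses = _lenient_timestamp
    if all(parses(v) for v in values):
        return "TimestampType"
    # Anything else, including a date/timestamp mix under a strict format.
    return "StringType"
```

In this model, a mixed column such as `["2022-01-01", "2022-01-02T11:30:00"]` falls back to `StringType` when `timestamp_format="%Y-%m-%dT%H:%M:%S"` is passed, because the date-only value fails the strict pattern, and it is inferred as `TimestampType` when no format is passed, because the lenient parser accepts both values.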