Details
Bug
Status: Resolved
Minor
Resolution: Cannot Reproduce
2.2.1
None
I discovered this on AWS Glue, which uses Spark 2.2.1.
Description
Spark can load multiple CSV files in one read:
df = spark.read.format("csv").option("header", "true").option("inferSchema", "true").load("/*.csv")
However, if one of these files is empty (it has a header but no data rows), Spark sets all column types to "String".
Spark should skip a file during schema inference if it contains no (non-header) rows.