In master branch, I tried to run TPC-DS queries (e.g Query27) at 200 GB scale. Initially I got the following exception (as FileScanRDD has been made the default in master branch)
When running with "spark.sql.sources.fileScan=false", following exception is thrown
TPC-DS dataset generator generates column names differently & maintains the mapping in hive metastore. This mapping is somehow broken in master causing these exceptions.
Creating this ticket as this used to work in earlier branches.