Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.4.3
Fix Version/s: None
Description
When reading a partitioned Hive ORC table that has no Spark schema properties, Spark lists all partitions and reads all files in order to infer the schema.
Other settings: native ORC mode; convertMetastoreOrc = true.
I think this can be improved by passing partitionFilters to fileIndex.listFiles.
// org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:238
val inferredSchema = fileFormat
  .inferSchema(
    sparkSession,
    options,
    fileIndex.listFiles(Nil, Nil).flatMap(_.files))
  .map(mergeWithMetastoreSchema(relation.tableMeta.dataSchema, _))
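To illustrate the idea, here is a minimal, self-contained sketch of how passing partition filters to the listing step shrinks the set of files handed to schema inference. It does not use Spark's classes: PartitionDir, ToyFileIndex, and filesForInference are hypothetical stand-ins, and filters are modeled as plain predicates rather than Catalyst expressions. How the real partitionFilters would reach the call in HiveMetastoreCatalog is left open here.

```scala
// Hypothetical, simplified stand-ins for Spark's partition listing; not Spark's actual classes.
final case class PartitionDir(values: Map[String, String], files: Seq[String])

final class ToyFileIndex(partitions: Seq[PartitionDir]) {
  // An empty filter list matches every partition, which mirrors listFiles(Nil, Nil).
  def listFiles(partitionFilters: Seq[Map[String, String] => Boolean]): Seq[PartitionDir] =
    partitions.filter(p => partitionFilters.forall(f => f(p.values)))
}

object InferSchemaSketch {
  // Files that schema inference would have to open for the given filters.
  def filesForInference(
      index: ToyFileIndex,
      partitionFilters: Seq[Map[String, String] => Boolean]): Seq[String] =
    index.listFiles(partitionFilters).flatMap(_.files)

  def main(args: Array[String]): Unit = {
    val index = new ToyFileIndex(Seq(
      PartitionDir(Map("dt" -> "2019-06-01"), Seq("a.orc", "b.orc")),
      PartitionDir(Map("dt" -> "2019-06-02"), Seq("c.orc"))))

    // Current behavior: no filters, so every file in every partition is listed.
    val all = filesForInference(index, Nil)

    // Proposed behavior: only partitions matching the query's filter are listed.
    val onlyDt: Map[String, String] => Boolean = _.get("dt").contains("2019-06-02")
    val pruned = filesForInference(index, Seq(onlyDt))

    println(s"unfiltered: ${all.size} files, filtered: ${pruned.size} files")
  }
}
```

One caveat with this approach: a schema inferred from a subset of partitions may miss columns that only appear in files of other partitions, so the merge with the metastore schema still matters.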