Description
We recently added (and enabled by default) table partition pruning for partitioned Hive tables converted to use TableFileCatalog. When the Hive configuration option hive.metastore.try.direct.sql is set to false, Hive will throw an exception for unsupported filter expressions. For example, attempting to filter on an integer partition column will throw an org.apache.hadoop.hive.metastore.api.MetaException.
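For illustration, here is a minimal sketch of the kind of query that hits this path (the table name, column names, and storage format are made up, and a Hive-enabled SparkSession such as spark-shell is assumed). With hive.metastore.try.direct.sql set to false on the metastore side, the pushed-down filter on the integer partition column fails with a MetaException:
{code:scala}
// Hypothetical example: a Hive table partitioned by an integer column.
spark.sql("CREATE TABLE logs (msg STRING) PARTITIONED BY (day INT) STORED AS PARQUET")

// With metastore partition pruning enabled, the predicate on `day` is pushed
// to the metastore via getPartitionsByFilter. When direct SQL is disabled,
// the metastore rejects the non-string partition filter and throws
// org.apache.hadoop.hive.metastore.api.MetaException.
spark.sql("SELECT * FROM logs WHERE day = 20161101").show()
{code}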
I discovered this behavior because VideoAmp uses the CDH version of Hive with a PostgreSQL metastore DB. In this configuration, CDH sets hive.metastore.try.direct.sql to false by default, and queries that filter on a non-string partition column will fail. That would be a rather rude surprise for these Spark 2.1 users...
I'm not sure exactly what behavior we should expect, but I suggest that HiveClientImpl.getPartitionsByFilter catch this metastore exception and return all partitions instead, as sketched below. This is what Spark already does for Hive 0.12 users, since Hive 0.12 does not support this feature at all.
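A rough sketch of the fallback I have in mind is below. The helper name, signature, and exception handling are illustrative only, not the actual HiveClientImpl/HiveShim code (which goes through reflection-based shim calls); the point is just to catch the MetaException from the pushed-down filter and fall back to fetching all partitions:
{code:scala}
import org.apache.hadoop.hive.metastore.api.MetaException
import org.apache.hadoop.hive.ql.metadata.{Hive, Partition, Table}

import scala.collection.JavaConverters._

// Illustrative helper only; the real change would live in HiveClientImpl /
// HiveShim rather than calling Hive directly like this.
def getPartitionsByFilterWithFallback(
    hive: Hive,
    table: Table,
    filter: String): Seq[Partition] = {
  try {
    hive.getPartitionsByFilter(table, filter).asScala.toSeq
  } catch {
    case _: MetaException =>
      // The metastore could not evaluate the pushed-down filter (e.g. direct
      // SQL disabled and a non-string partition column), so give up on
      // pruning and return every partition of the table.
      hive.getAllPartitionsOf(table).asScala.toSeq
  }
}
{code}
We would probably also want to log a warning when the fallback kicks in, so users can tell that partition pruning was skipped for the query.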
Issue Links
- relates to SPARK-36137: HiveShim always fallback to getAllPartitionsOf regardless of whether directSQL is enabled in remote HMS (Resolved)