  1. Spark
  2. SPARK-17861 Store data source partitions in metastore and push partition pruning into metastore
  3. SPARK-17992

HiveClient.getPartitionsByFilter throws an exception for some unsupported filters when hive.metastore.try.direct.sql=false

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: SQL
    • Labels:
      None

      Description

      We recently added (and enabled by default) table partition pruning for partitioned Hive tables converted to use TableFileCatalog. When the Hive configuration option hive.metastore.try.direct.sql is set to false, Hive throws an exception for unsupported filter expressions. For example, attempting to filter on an integer partition column throws an org.apache.hadoop.hive.metastore.api.MetaException.

      I discovered this behavior because VideoAmp uses the CDH version of Hive with a PostgreSQL metastore DB. In this configuration, CDH sets hive.metastore.try.direct.sql to false by default, and queries that filter on a non-string partition column fail. That would be a rather rude surprise for these Spark 2.1 users...

      I'm not sure exactly what behavior we should expect, but I suggest that HiveClientImpl.getPartitionsByFilter catch this metastore exception and return all partitions instead. This is what Spark does for users of Hive 0.12, which does not support this feature at all.
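
The suggested fallback can be sketched roughly as follows. This is a minimal, self-contained illustration in Python, not the actual Spark or Hive code: FakeMetastore, MetaException, get_partitions_with_fallback and prune are hypothetical stand-ins, and the rule that only string-valued filters can be pushed down without direct SQL is an assumption taken from the example in the description.

```python
class MetaException(Exception):
    """Hypothetical stand-in for org.apache.hadoop.hive.metastore.api.MetaException."""


class FakeMetastore:
    """Hypothetical in-memory metastore; this is not the Hive or Spark API."""

    def __init__(self, partitions, try_direct_sql):
        self.partitions = partitions
        self.try_direct_sql = try_direct_sql

    def get_all_partitions(self):
        return list(self.partitions)

    def get_partitions_by_filter(self, column, value):
        # Assumption for illustration: without direct SQL, a filter on a
        # non-string partition column is rejected by the metastore.
        if not self.try_direct_sql and not isinstance(value, str):
            raise MetaException("unsupported filter (illustrative message)")
        return [p for p in self.partitions if p.get(column) == value]


def get_partitions_with_fallback(metastore, column, value):
    """Try metastore-side pruning; on MetaException, return every partition."""
    try:
        return metastore.get_partitions_by_filter(column, value)
    except MetaException:
        # Fallback: no pruning happened, so the result is a superset.
        return metastore.get_all_partitions()


def prune(partitions, column, value):
    """Client-side filter the caller still applies to the returned partitions."""
    return [p for p in partitions if p.get(column) == value]


if __name__ == "__main__":
    parts = [{"year": 2015}, {"year": 2016}]
    metastore = FakeMetastore(parts, try_direct_sql=False)
    candidates = get_partitions_with_fallback(metastore, "year", 2016)
    print(len(candidates), prune(candidates, "year", 2016))
```

In this sketch the fallback trades pruning efficiency for availability: the query still succeeds, but the caller receives every partition and has to apply the predicate itself to get the correct subset.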

            People

            • Assignee:
              michael Michael MacFadden
              Reporter:
              michael Michael MacFadden
