Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Hudi currently does not work with the AWS Glue Catalog. The exception it runs into has also been reported in a separate issue.
As mentioned in the issue, the reason for this is:
- Hudi currently interacts with Hive in two different ways:
  - The table-creation statement is submitted directly to Hive via JDBC: https://github.com/apache/incubator-hudi/blob/master/hudi-hive/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java#L472 . Hive therefore creates the right metastore client internally (i.e. the Glue client, if hive.metastore.client.factory.class is set to com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory in hive-site).
  - Partition listing, among other things, is done by calling the Hive metastore APIs directly through a Hive metastore client: https://github.com/apache/incubator-hudi/blob/master/hudi-hive/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java#L240
- In the Hudi code, however, the standard implementation of the metastore client (not the Glue metastore client) is instantiated directly: https://github.com/apache/incubator-hudi/blob/master/hudi-hive/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java#L109 .
- Ideally, instantiating the metastore client should be left to Hive, through https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L5045 , so that it considers other metastore client implementations that might be configured through hive.metastore.client.factory.class .
This is why the table gets created in the Glue metastore, while reading or scanning partitions talks to the local Hive metastore, where the table does not exist.
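The mismatch can be illustrated with a small, self-contained Java model. This is only a sketch of the pattern, not Hudi or Hive code: `MetaClient`, `hardCodedClient`, `configuredClient`, and the `FACTORIES` registry are hypothetical stand-ins, and a plain `Map` plays the role of hive-site. Only the property name and the Glue factory class name are taken from the description above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class MetastoreClientSketch {

    // Hypothetical stand-in for a metastore client; reports which backend it talks to.
    interface MetaClient {
        String backend();
    }

    // Property and factory class name as quoted in the issue description.
    static final String FACTORY_KEY = "hive.metastore.client.factory.class";
    static final String GLUE_FACTORY =
        "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory";

    // Toy registry standing in for whatever mechanism loads the configured factory.
    static final Map<String, Supplier<MetaClient>> FACTORIES = new HashMap<>();
    static {
        FACTORIES.put(GLUE_FACTORY, () -> () -> "glue");
    }

    // Models the problem: the configuration is ignored and the
    // standard client is always instantiated.
    static MetaClient hardCodedClient(Map<String, String> conf) {
        return () -> "hive-metastore";
    }

    // Models the proposed behaviour: honour the configured factory,
    // falling back to the standard client when none is set.
    static MetaClient configuredClient(Map<String, String> conf) {
        String factoryName = conf.get(FACTORY_KEY);
        Supplier<MetaClient> factory =
            factoryName == null ? null : FACTORIES.get(factoryName);
        if (factory != null) {
            return factory.get();
        }
        return () -> "hive-metastore";
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(FACTORY_KEY, GLUE_FACTORY);

        // With Glue configured, the two paths disagree about the backend,
        // which mirrors "table created in Glue, partitions read elsewhere".
        System.out.println("hard-coded path: " + hardCodedClient(conf).backend());
        System.out.println("configured path: " + configuredClient(conf).backend());
    }
}
```

In this model, a table created through the configured path and then looked up through the hard-coded path ends up in two different backends, which is the behaviour described in this issue.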