Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- v2.6.4
- None
EMR 5.23 (Hadoop 2.8.5, HBase 1.4.9, Hive 2.3.4, Spark 2.4.0, Tez 0.9.1, HCatalog 2.3.4, ZooKeeper 3.4.13)
Kylin 2.6.4
Description
Hi,
I set up Kylin on EMR 5.23. The Kylin version is 2.6.4. When building the cube, the Hive table cannot be found. The detailed error information is as follows:
java.lang.RuntimeException: java.io.IOException: NoSuchObjectException(message:kylin_flat_db_test1.kylin_intermediate_kylin_sales_cube_4e93b31d_3be2_c9e8_55de_a9814f63c4ba table not found)
    at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:83)
    at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:126)
    at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:104)
    at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
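When triaging an error like this, it can help to pull the database and table name out of the message so they can be looked up in the catalog by hand. The following is only a minimal sketch: the helper name `missing_table` is hypothetical, and the regular expression assumes the exact message shape seen in this report (`<db>.<table> table not found`), which other Hive versions may not share.

```python
import re

# Assumed message shape, copied from the error in this report:
#   NoSuchObjectException(message:<db>.<table> table not found)
_MISSING_TABLE = re.compile(
    r"NoSuchObjectException\(message:([^.\s]+)\.(\S+) table not found\)"
)


def missing_table(error_text):
    """Return (database, table) from the first matching message, or None."""
    match = _MISSING_TABLE.search(error_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


error = (
    "java.io.IOException: NoSuchObjectException(message:"
    "kylin_flat_db_test1.kylin_intermediate_kylin_sales_cube_"
    "4e93b31d_3be2_c9e8_55de_a9814f63c4ba table not found)"
)
print(missing_table(error))
```

The resulting pair can then be checked manually, for example with `SHOW TABLES IN kylin_flat_db_test1;` in the Hive CLI, to see which catalog actually holds the intermediate table.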
On EMR, the Hive metadata is shared through AWS Glue, and the metastore URI is configured in hive-site.xml:
<property>
<name>hive.metastore.uris</name>
<value>thrift://ip-172-40-15-164.ec2.internal:9083</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>hive.metastore.client.factory.class</name>
<value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
</property>
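To confirm which metastore client a given hive-site.xml selects, the properties can be read programmatically. This is a rough sketch using only the Python standard library; the function names `read_hive_properties` and `uses_glue_catalog` are hypothetical, and the sample document simply wraps the two properties quoted above in a `<configuration>` root, which is assumed here.

```python
import xml.etree.ElementTree as ET

# Factory class value quoted in this report.
GLUE_FACTORY = (
    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
)


def read_hive_properties(xml_text):
    """Return {name: value} for every <property> element in the document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def uses_glue_catalog(props):
    """True when the Glue client factory is the configured metastore client."""
    return props.get("hive.metastore.client.factory.class") == GLUE_FACTORY


# Sample document; the <configuration> root element is an assumption.
SAMPLE = """<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://ip-172-40-15-164.ec2.internal:9083</value>
  </property>
  <property>
    <name>hive.metastore.client.factory.class</name>
    <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
  </property>
</configuration>"""

props = read_hive_properties(SAMPLE)
print(props["hive.metastore.uris"])
print(uses_glue_catalog(props))
```

Because the default ElementTree parser skips XML comments, a property that has been commented out is not reported, so the same check can be rerun after editing the file.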
However, when I use Hive's own metastore instead of sharing metadata through Glue, the exception does not occur. To do that, I comment out the following configuration:
<!--<property>
<name>hive.metastore.client.factory.class</name>
<value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
</property>
-->
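The same change can be scripted for testing. This is an illustrative sketch only: `drop_property` is a hypothetical helper, it removes the property rather than commenting it out, and the sample document with its `<configuration>` root is an assumption. Note that ElementTree discards any existing XML comments when it parses, so the output should be written to a copy of the file, not over the original.

```python
import xml.etree.ElementTree as ET

FACTORY_KEY = "hive.metastore.client.factory.class"


def drop_property(xml_text, key):
    """Return the document as a string with the named top-level property removed."""
    root = ET.fromstring(xml_text)
    # Iterate over a copy of the list because elements are removed in the loop.
    for prop in list(root.findall("property")):
        if (prop.findtext("name") or "").strip() == key:
            root.remove(prop)
    return ET.tostring(root, encoding="unicode")


# Sample document; the <configuration> root element is an assumption.
SAMPLE = """<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://ip-172-40-15-164.ec2.internal:9083</value>
  </property>
  <property>
    <name>hive.metastore.client.factory.class</name>
    <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
  </property>
</configuration>"""

cleaned = drop_property(SAMPLE, FACTORY_KEY)
print(cleaned)
```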
However, EMR relies on the shared metadata, so without metadata sharing I cannot query the other Hive tables created by the cluster.
The configuration file is provided in the attachment. Please help me solve this problem. Thank you.
Best regards.
Note:
For anyone interested in Glue support, https://issues.apache.org/jira/browse/KYLIN-3685?focusedCommentId=17002995&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17002995 describes another verified workaround. See kaige's comment at that link.
Attachments
Issue Links
- Blocked
  - KYLIN-3685 AWS Glue Catalog Not Supported (Resolved)
- fixes
  - KYLIN-3685 AWS Glue Catalog Not Supported (Resolved)
- links to