[KYLIN-5646] The build job reports an error at the step of detecting time partition columns in the Yarn Cluster mode - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 5.0-alpha
Fix Version/s: 5.0-beta
Component/s: Tools, Build and Test
Labels: None

Description

When building in Spark YARN-Cluster mode, the step that detects the incremental time partition column fails while initializing KylinConfig, with the error "Didn't find KYLIN_CONF or KYLIN_HOME".

Reproduce method

Incrementally build a model on a partitioned table in Spark YARN-Cluster mode, with kylin.engine.check-partition-col-enabled=true (true is the default value).

Root Cause

KYLIN-5571 modified autoSetShufflePartitions for pushdown queries. Before that change, autoSetShufflePartitions did not need to run when the build task detected the incremental time partition column format (only the pushdown query itself was executed).

After the change, autoSetShufflePartitions is executed asynchronously, and the following two methods obtain KylinConfig through KylinConfig.getInstanceFromEnv:

ResourceDetectUtils.getResourceSizeWithTimeoutByConcurrency
ResourceDetectUtils.getResourceSizBySerial

The newly started asynchronous thread cannot use the KylinConfig that was already built, so KylinConfig is initialized again.

However, the build task JVM and the KE main process do not run on the same machine, so KYLIN_CONF and KYLIN_HOME cannot be found there and the build task fails.
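
The failure mode described above can be reproduced in miniature. The sketch below is not Kylin code: DemoConfig, bindToCurrentThread, and the DEMO_SKETCH_* environment variable names are hypothetical stand-ins, and it assumes the already-built config is visible only to the thread that created it (modelled here with a ThreadLocal). Under that assumption, a lookup on a new thread falls back to the environment and fails when the variables are not set.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadLocalConfigLoss {
    /** Hypothetical stand-in for a config holder; not Kylin's real class. */
    static final class DemoConfig {
        private static final ThreadLocal<DemoConfig> THREAD_INSTANCE = new ThreadLocal<>();
        final String source;

        DemoConfig(String source) {
            this.source = source;
        }

        /** Assumption: the built config is attached to the current thread only. */
        static void bindToCurrentThread(DemoConfig config) {
            THREAD_INSTANCE.set(config);
        }

        /** Uses the thread-bound config if present, otherwise falls back to the environment. */
        static DemoConfig getInstanceFromEnv() {
            DemoConfig bound = THREAD_INSTANCE.get();
            if (bound != null) {
                return bound;
            }
            // Hypothetical variable names, chosen so that they are not set on the machine.
            String conf = System.getenv("DEMO_SKETCH_CONF_UNSET");
            String home = System.getenv("DEMO_SKETCH_HOME_UNSET");
            if (conf == null && home == null) {
                throw new IllegalStateException("Didn't find DEMO_SKETCH_CONF or DEMO_SKETCH_HOME");
            }
            return new DemoConfig("environment");
        }
    }

    /** Binds a config on the calling thread, then looks it up from a separate worker thread. */
    static String lookupOnWorkerThread() {
        DemoConfig.bindToCurrentThread(new DemoConfig("built by the job launcher"));
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> DemoConfig.getInstanceFromEnv().source;
            Future<String> result = pool.submit(task);
            return result.get();
        } catch (ExecutionException e) {
            // The worker has no thread-bound config, so the environment fallback threw.
            return "worker failed: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String fromWorker = lookupOnWorkerThread();
        // The calling thread still sees the config it bound.
        System.out.println("caller: " + DemoConfig.getInstanceFromEnv().source);
        System.out.println(fromWorker);
    }
}
```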

Fix design

In all logic that runs on newly started threads, KylinConfig must no longer be obtained through KylinConfig.getInstanceFromEnv(). Instead, the outer (calling) thread obtains the KylinConfig once and passes it to every place that needs it.
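
A minimal sketch of this pattern follows, assuming nothing about Kylin's actual signatures: DemoConfig, sizeOf, and totalSize are hypothetical names, and the "size" is just the length of a resolved path so that the example needs no file system. The point it illustrates is that the caller resolves the config once and hands it to the worker tasks as an ordinary argument, so no worker performs an environment lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PassConfigToWorkers {
    /** Hypothetical config object standing in for the one the build job already holds. */
    static final class DemoConfig {
        final String workingDir;
        final long detectTimeoutSeconds;

        DemoConfig(String workingDir, long detectTimeoutSeconds) {
            this.workingDir = workingDir;
            this.detectTimeoutSeconds = detectTimeoutSeconds;
        }
    }

    /** Placeholder size computation: length of the path resolved against the config. */
    static long sizeOf(DemoConfig config, String relativePath) {
        return (config.workingDir + "/" + relativePath).length();
    }

    /** The config is a parameter, so worker threads never look it up themselves. */
    static long totalSize(DemoConfig config, List<String> relativePaths) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (String path : relativePaths) {
                // The task captures the caller's config instead of reading the environment.
                Callable<Long> task = () -> sizeOf(config, path);
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get(config.detectTimeoutSeconds, TimeUnit.SECONDS);
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("resource detection interrupted", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("resource detection failed", e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // The outer thread obtains the config once and passes it down.
        DemoConfig config = new DemoConfig("/kylin", 5);
        System.out.println(totalSize(config, Arrays.asList("a", "bc")));
    }
}
```

Passing the config explicitly also keeps the worker code independent of where it runs, which matters when the task JVM is on a different machine from the process that built the config.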

People

Assignee: Zhiting Guo

Reporter: Zhiting Guo

Votes: 0

Watchers: 2

Dates

Created: 18/Jul/23 02:30

Updated: 23/Aug/23 08:45

Resolved: 23/Aug/23 08:45