2021-10-20 23:23:38,704 INFO [pool-1-thread-1] application.SparkApplication : Executor task org.apache.kylin.engine.spark.job.CubeBuildJob with args : {"distMetaUrl":"kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta","submitter":"ADMIN","dataRangeEnd":"9223372036854775807","targetModel":"31115e9c-7baa-61e6-2997-378ff63a38ec","dataRangeStart":"0","project":"empTest","className":"org.apache.kylin.engine.spark.job.CubeBuildJob","segmentName":"FULL_BUILD","parentId":"b532af1a-abef-4d0f-8b29-450744d499fc","jobId":"b532af1a-abef-4d0f-8b29-450744d499fc","outputMetaUrl":"kylin_metadata@jdbc,url=jdbc:mysql://localhost:3306/kylin,username=root,password=******,maxActive=10,maxIdle=10","segmentId":"4cf9a91e-67c7-9a04-7fec-d2732f5353a4","cuboidsNum":"31","cubeName":"testCube","jobType":"BUILD","cubeId":"30c0a683-9424-2157-5b51-43991cf95e4e","segmentIds":"4cf9a91e-67c7-9a04-7fec-d2732f5353a4"}
2021-10-20 23:23:38,707 INFO [pool-1-thread-1] utils.MetaDumpUtil : Ready to load KylinConfig from uri: kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:23:38,793 INFO [pool-1-thread-1] common.KylinConfigBase : Kylin Config was updated with kylin.metadata.url.identifier : kylin_metadata
2021-10-20 23:23:38,796 INFO [pool-1-thread-1] common.KylinConfigBase : Kylin Config was updated with kylin.log.spark-executor-properties-file : /opt/kylin/conf/spark-executor-log4j.properties
2021-10-20 23:23:38,797 INFO [pool-1-thread-1] common.KylinConfigBase : Kylin Config was updated with kylin.source.provider.0 : org.apache.kylin.engine.spark.source.HiveSource
2021-10-20 23:23:38,801 INFO [pool-1-thread-1] application.SparkApplication : Start set spark conf automatically.
2021-10-20 23:23:39,707 DEBUG [pool-1-thread-1] util.HadoopUtil : Use provider:org.apache.kylin.common.storage.DefaultStorageProvider
2021-10-20 23:23:39,735 DEBUG [pool-1-thread-1] util.HadoopUtil : Use provider:org.apache.kylin.common.storage.DefaultStorageProvider
2021-10-20 23:23:39,794 INFO [pool-1-thread-1] job.CubeBuildJob : The maximum number of tasks required to run the job is 6.0
2021-10-20 23:23:39,795 INFO [pool-1-thread-1] job.CubeBuildJob : require cores: 2
2021-10-20 23:23:39,832 INFO [pool-1-thread-1] application.SparkApplication : Exist count distinct measure: false
2021-10-20 23:23:39,899 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stdout {"scheduler":{"schedulerInfo":{"type":"capacityScheduler","capacity":100.0,"usedCapacity":0.0,"maxCapacity":100.0,"queueName":"root","queues":{"queue":[{"type":"capacitySchedulerLeafQueueInfo","capacity":100.0,"usedCapacity":0.0,"maxCapacity":100.0,"absoluteCapacity":100.0,"absoluteMaxCapacity":100.0,"absoluteUsedCapacity":0.0,"numApplications":0,"queueName":"default","state":"RUNNING","resourcesUsed":{"memory":0,"vCores":0},"hideReservationQueues":false,"nodeLabels":["*"],"numActiveApplications":0,"numPendingApplications":0,"numContainers":0,"maxApplications":10000,"maxApplicationsPerUser":10000,"userLimit":100,"users":null,"userLimitFactor":1.0,"AMResourceLimit":{"memory":5120,"vCores":1},"usedAMResource":{"memory":0,"vCores":0},"userAMResourceLimit":{"memory":5120,"vCores":1},"preemptionDisabled":true}]}}}}
2021-10-20 23:23:39,900 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr % Total % Received % Xferd Average Speed Time Time Time Current
2021-10-20 23:23:39,900 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed
2021-10-20 23:23:39,900 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr
2021-10-20 23:23:39,900 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
2021-10-20 23:23:39,900 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 100 821 0 821 0 0 86557 0 --:--:-- --:--:-- --:--:-- 91222
2021-10-20 23:23:39,901 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : Thread wait for executing command curl -k --negotiate -u : "http://henghe-031:8088/ws/v1/cluster/scheduler"
2021-10-20 23:23:39,920 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr % Total % Received % Xferd Average Speed Time Time Time Current
2021-10-20 23:23:39,920 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed
2021-10-20 23:23:39,921 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr
2021-10-20 23:23:39,921 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed connect to henghe-030:8088; Connection refused
2021-10-20 23:23:39,921 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : Thread wait for executing command curl -k --negotiate -u : "http://henghe-030:8088/ws/v1/cluster/scheduler"
2021-10-20 23:23:39,950 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Queue available capacity: 1.0.
2021-10-20 23:23:39,950 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Current queue used memory is 0, seem available resource as infinite.
2021-10-20 23:23:39,951 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Cluster available capacity: 1.0.
2021-10-20 23:23:39,952 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Capacity actual available resource: AvailableResource(ResourceInfo(2147483647,2147483647),ResourceInfo(2147483647,2147483647)).
2021-10-20 23:23:39,994 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stdout {"scheduler":{"schedulerInfo":{"type":"capacityScheduler","capacity":100.0,"usedCapacity":0.0,"maxCapacity":100.0,"queueName":"root","queues":{"queue":[{"type":"capacitySchedulerLeafQueueInfo","capacity":100.0,"usedCapacity":0.0,"maxCapacity":100.0,"absoluteCapacity":100.0,"absoluteMaxCapacity":100.0,"absoluteUsedCapacity":0.0,"numApplications":0,"queueName":"default","state":"RUNNING","resourcesUsed":{"memory":0,"vCores":0},"hideReservationQueues":false,"nodeLabels":["*"],"numActiveApplications":0,"numPendingApplications":0,"numContainers":0,"maxApplications":10000,"maxApplicationsPerUser":10000,"userLimit":100,"users":null,"userLimitFactor":1.0,"AMResourceLimit":{"memory":5120,"vCores":1},"usedAMResource":{"memory":0,"vCores":0},"userAMResourceLimit":{"memory":5120,"vCores":1},"preemptionDisabled":true}]}}}}
2021-10-20 23:23:39,994 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr % Total % Received % Xferd Average Speed Time Time Time Current
2021-10-20 23:23:39,994 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed
2021-10-20 23:23:39,994 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr
2021-10-20 23:23:39,995 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
2021-10-20 23:23:39,995 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 100 821 0 821 0 0 95188 0 --:--:-- --:--:-- --:--:-- 100k
2021-10-20 23:23:39,995 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : Thread wait for executing command curl -k --negotiate -u : "http://henghe-031:8088/ws/v1/cluster/scheduler"
2021-10-20 23:23:40,014 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr % Total % Received % Xferd Average Speed Time Time Time Current
2021-10-20 23:23:40,015 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed
2021-10-20 23:23:40,015 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr
2021-10-20 23:23:40,015 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed connect to henghe-030:8088; Connection refused
2021-10-20 23:23:40,015 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : Thread wait for executing command curl -k --negotiate -u : "http://henghe-030:8088/ws/v1/cluster/scheduler"
2021-10-20 23:23:40,018 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Queue available capacity: 1.0.
2021-10-20 23:23:40,018 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Current queue used memory is 0, seem available resource as infinite.
2021-10-20 23:23:40,019 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Cluster available capacity: 1.0.
2021-10-20 23:23:40,019 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Capacity actual available resource: AvailableResource(ResourceInfo(2147483647,2147483647),ResourceInfo(2147483647,2147483647)).
2021-10-20 23:23:40,022 INFO [pool-1-thread-1] utils.SparkConfHelper : Auto set spark conf: spark.executor.memory = 1GB.
2021-10-20 23:23:40,022 INFO [pool-1-thread-1] utils.SparkConfHelper : Auto set spark conf: count_distinct = false.
2021-10-20 23:23:40,022 INFO [pool-1-thread-1] utils.SparkConfHelper : Auto set spark conf: spark.executor.cores = 1.
2021-10-20 23:23:40,022 INFO [pool-1-thread-1] utils.SparkConfHelper : Auto set spark conf: spark.executor.memoryOverhead = 512MB.
2021-10-20 23:23:40,022 INFO [pool-1-thread-1] utils.SparkConfHelper : Auto set spark conf: spark.executor.instances = 5.
2021-10-20 23:23:40,022 INFO [pool-1-thread-1] utils.SparkConfHelper : Auto set spark conf: spark.yarn.queue = default.
2021-10-20 23:23:40,022 INFO [pool-1-thread-1] utils.SparkConfHelper : Auto set spark conf: spark.sql.shuffle.partitions = 2.
2021-10-20 23:23:40,023 INFO [pool-1-thread-1] application.SparkApplication : Override user-defined spark conf, set spark.yarn.queue=default.
2021-10-20 23:23:40,023 INFO [pool-1-thread-1] application.SparkApplication : Override user-defined spark conf, set spark.history.fs.logDirectory=hdfs:///kylin/spark-history.
2021-10-20 23:23:40,023 INFO [pool-1-thread-1] application.SparkApplication : Override user-defined spark conf, set spark.driver.extraJavaOptions=-XX:+CrashOnOutOfMemoryError.
2021-10-20 23:23:40,023 INFO [pool-1-thread-1] application.SparkApplication : Override user-defined spark conf, set spark.master=yarn.
2021-10-20 23:23:40,023 INFO [pool-1-thread-1] application.SparkApplication : Override user-defined spark conf, set spark.executor.extraJavaOptions=-Dfile.encoding=UTF-8 -Dhdp.version=current -Dlog4j.configuration=spark-executor-log4j.properties -Dlog4j.debug -Dkylin.hdfs.working.dir=hdfs://master/kylin/kylin_metadata/ -Dkylin.metadata.identifier=kylin_metadata -Dkylin.spark.category=job -Dkylin.spark.project=empTest -Dkylin.spark.identifier=b532af1a-abef-4d0f-8b29-450744d499fc -Dkylin.spark.jobName=b532af1a-abef-4d0f-8b29-450744d499fc-01 -Duser.timezone=America/New_York.
2021-10-20 23:23:40,023 INFO [pool-1-thread-1] application.SparkApplication : Override user-defined spark conf, set spark.hadoop.yarn.timeline-service.enabled=false.
2021-10-20 23:23:40,023 INFO [pool-1-thread-1] application.SparkApplication : Override user-defined spark conf, set spark.eventLog.enabled=true.
2021-10-20 23:23:40,023 INFO [pool-1-thread-1] application.SparkApplication : Override user-defined spark conf, set spark.eventLog.dir=hdfs:///kylin/spark-history.
2021-10-20 23:23:40,023 INFO [pool-1-thread-1] application.SparkApplication : Override user-defined spark conf, set spark.submit.deployMode=client.
2021-10-20 23:23:40,040 INFO [pool-1-thread-1] util.TimeZoneUtils : System timezone set to America/New_York, TimeZoneId: America/New_York.
2021-10-20 23:23:40,040 INFO [pool-1-thread-1] application.SparkApplication : Sleep for random seconds to avoid submitting too many spark job at the same time.
2021-10-20 23:23:52,285 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stdout {"scheduler":{"schedulerInfo":{"type":"capacityScheduler","capacity":100.0,"usedCapacity":0.0,"maxCapacity":100.0,"queueName":"root","queues":{"queue":[{"type":"capacitySchedulerLeafQueueInfo","capacity":100.0,"usedCapacity":0.0,"maxCapacity":100.0,"absoluteCapacity":100.0,"absoluteMaxCapacity":100.0,"absoluteUsedCapacity":0.0,"numApplications":0,"queueName":"default","state":"RUNNING","resourcesUsed":{"memory":0,"vCores":0},"hideReservationQueues":false,"nodeLabels":["*"],"numActiveApplications":0,"numPendingApplications":0,"numContainers":0,"maxApplications":10000,"maxApplicationsPerUser":10000,"userLimit":100,"users":null,"userLimitFactor":1.0,"AMResourceLimit":{"memory":5120,"vCores":1},"usedAMResource":{"memory":0,"vCores":0},"userAMResourceLimit":{"memory":5120,"vCores":1},"preemptionDisabled":true}]}}}}
2021-10-20 23:23:52,285 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr % Total % Received % Xferd Average Speed Time Time Time Current
2021-10-20 23:23:52,285 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed
2021-10-20 23:23:52,285 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr
2021-10-20 23:23:52,285 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
2021-10-20 23:23:52,285 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 100 821 0 821 0 0 91639 0 --:--:-- --:--:-- --:--:-- 100k
2021-10-20 23:23:52,285 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : Thread wait for executing command curl -k --negotiate -u : "http://henghe-031:8088/ws/v1/cluster/scheduler"
2021-10-20 23:23:52,304 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr % Total % Received % Xferd Average Speed Time Time Time Current
2021-10-20 23:23:52,305 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed
2021-10-20 23:23:52,305 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr
2021-10-20 23:23:52,305 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed connect to henghe-030:8088; Connection refused
2021-10-20 23:23:52,306 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : Thread wait for executing command curl -k --negotiate -u : "http://henghe-030:8088/ws/v1/cluster/scheduler"
2021-10-20 23:23:52,310 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Queue available capacity: 1.0.
2021-10-20 23:23:52,310 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Current queue used memory is 0, seem available resource as infinite.
2021-10-20 23:23:52,310 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Cluster available capacity: 1.0.
2021-10-20 23:23:52,311 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Capacity actual available resource: AvailableResource(ResourceInfo(2147483647,2147483647),ResourceInfo(2147483647,2147483647)).
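The CapacitySchedulerParser entries above report available resources as ResourceInfo(2147483647,2147483647), i.e. Int.MaxValue: when the queue's used memory is 0, the parser treats capacity as effectively infinite rather than computing a remainder. A minimal sketch of that fallback logic; the function name and signature here are hypothetical illustration, not the actual Kylin source:

```python
INT_MAX = 2**31 - 1  # matches the 2147483647 in the log lines above

def available_resource(total_mb, used_mb, total_vcores, used_vcores):
    """Return (memory MB, vcores) still available to the queue.

    If nothing has been used yet, report Int.MaxValue for both,
    i.e. treat the available resource as infinite, as the
    CapacitySchedulerParser log message describes.
    """
    if used_mb == 0:
        return (INT_MAX, INT_MAX)
    return (total_mb - used_mb, total_vcores - used_vcores)

print(available_resource(8192, 0, 8, 0))      # → (2147483647, 2147483647)
print(available_resource(8192, 2048, 8, 2))   # → (6144, 6)
```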
2021-10-20 23:23:52,947 INFO [pool-1-thread-1] util.log : Logging initialized @16384ms
2021-10-20 23:23:53,011 INFO [pool-1-thread-1] server.Server : jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2021-10-20 23:23:53,027 INFO [pool-1-thread-1] server.Server : Started @16465ms
2021-10-20 23:23:53,046 INFO [pool-1-thread-1] server.AbstractConnector : Started ServerConnector@32066735{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2021-10-20 23:23:53,071 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@2e0a089c{/jobs,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,072 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@491023{/jobs/json,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,072 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@4336d985{/jobs/job,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,073 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@4a1930af{/jobs/job/json,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,073 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@299f9c7{/stages,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,074 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@7796f2e0{/stages/json,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,074 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@33cb23ee{/stages/stage,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,075 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@3d0f9dd7{/stages/stage/json,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,075 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@57310dcb{/stages/pool,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,075 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@4079b93c{/stages/pool/json,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,076 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@4e32754e{/storage,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,076 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@f12b87a{/storage/json,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,077 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@5092a306{/storage/rdd,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,077 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@5d2f4071{/storage/rdd/json,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,077 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@6ad69f{/environment,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,077 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@1aa17f80{/environment/json,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,078 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@5fee7541{/executors,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,078 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@78e59a05{/executors/json,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,078 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@2eed65a8{/executors/threadDump,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,079 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@3f21528a{/executors/threadDump/json,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,087 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@13496c52{/static,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,087 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@7c44879a{/,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,088 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@4ef7fc4d{/api,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,089 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@53dd90ef{/jobs/job/kill,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,089 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@4d31ef3d{/stages/stage/kill,null,AVAILABLE,@Spark}
2021-10-20 23:23:53,378 WARN [pool-1-thread-1] yarn.Client : Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
2021-10-20 23:23:56,695 INFO [pool-1-thread-1] impl.YarnClientImpl : Submitted application application_1634613529580_0039
2021-10-20 23:24:01,922 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@3cc3b96b{/metrics/json,null,AVAILABLE,@Spark}
2021-10-20 23:24:06,654 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@68702683{/SQL,null,AVAILABLE,@Spark}
2021-10-20 23:24:06,655 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@18aade82{/SQL/json,null,AVAILABLE,@Spark}
2021-10-20 23:24:06,656 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@65f53887{/SQL/execution,null,AVAILABLE,@Spark}
2021-10-20 23:24:06,656 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@2b7af342{/SQL/execution/json,null,AVAILABLE,@Spark}
2021-10-20 23:24:06,658 INFO [pool-1-thread-1] handler.ContextHandler : Started o.s.j.s.ServletContextHandler@61dbae56{/static/sql,null,AVAILABLE,@Spark}
2021-10-20 23:24:07,108 INFO [pool-1-thread-1] job.CubeBuildJob : Start building cube job for 4cf9a91e-67c7-9a04-7fec-d2732f5353a4 ...
2021-10-20 23:24:07,117 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.cube.CubeManager
2021-10-20 23:24:07,131 INFO [pool-1-thread-1] cube.CubeManager : Initializing CubeManager with config kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:24:07,133 INFO [pool-1-thread-1] persistence.ResourceStore : Using metadata url kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta for resource store
2021-10-20 23:24:07,152 INFO [pool-1-thread-1] persistence.HDFSResourceStore : hdfs meta path : hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:24:07,154 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading CubeInstance from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/cube
2021-10-20 23:24:07,245 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.cube.CubeDescManager
2021-10-20 23:24:07,245 INFO [pool-1-thread-1] cube.CubeDescManager : Initializing CubeDescManager with config kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:24:07,245 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading CubeDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/cube_desc
2021-10-20 23:24:07,293 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.project.ProjectManager
2021-10-20 23:24:07,294 INFO [pool-1-thread-1] project.ProjectManager : Initializing ProjectManager with metadata url kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:24:07,295 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading ProjectInstance from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/project
2021-10-20 23:24:07,308 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 1 ProjectInstance(s) out of 1 resource with 0 errors
2021-10-20 23:24:07,310 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.cachesync.Broadcaster
2021-10-20 23:24:07,312 DEBUG [pool-1-thread-1] cachesync.Broadcaster : 1 nodes in the cluster: [localhost:7070]
2021-10-20 23:24:07,315 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.model.DataModelManager
2021-10-20 23:24:07,319 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.TableMetadataManager
2021-10-20 23:24:07,320 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading TableDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/table
2021-10-20 23:24:07,346 DEBUG [pool-1-thread-1] measure.MeasureTypeFactory : registering COUNT_DISTINCT(hllc), class org.apache.kylin.measure.hllc.HLLCMeasureType$Factory
2021-10-20 23:24:07,349 DEBUG [pool-1-thread-1] measure.MeasureTypeFactory : registering COUNT_DISTINCT(bitmap), class org.apache.kylin.measure.bitmap.BitmapMeasureType$Factory
2021-10-20 23:24:07,354 DEBUG [pool-1-thread-1] measure.MeasureTypeFactory : registering TOP_N(topn), class org.apache.kylin.measure.topn.TopNMeasureType$Factory
2021-10-20 23:24:07,356 DEBUG [pool-1-thread-1] measure.MeasureTypeFactory : registering RAW(raw), class org.apache.kylin.measure.raw.RawMeasureType$Factory
2021-10-20 23:24:07,357 DEBUG [pool-1-thread-1] measure.MeasureTypeFactory : registering EXTENDED_COLUMN(extendedcolumn), class org.apache.kylin.measure.extendedcolumn.ExtendedColumnMeasureType$Factory
2021-10-20 23:24:07,358 DEBUG [pool-1-thread-1] measure.MeasureTypeFactory : registering PERCENTILE_APPROX(percentile), class org.apache.kylin.measure.percentile.PercentileMeasureType$Factory
2021-10-20 23:24:07,358 DEBUG [pool-1-thread-1] measure.MeasureTypeFactory : registering COUNT_DISTINCT(dim_dc), class org.apache.kylin.measure.dim.DimCountDistinctMeasureType$Factory
2021-10-20 23:24:07,360 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 2 TableDesc(s) out of 2 resource with 0 errors
2021-10-20 23:24:07,360 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading TableExtDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/table_exd
2021-10-20 23:24:07,380 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 2 TableExtDesc(s) out of 2 resource with 0 errors
2021-10-20 23:24:07,381 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading ExternalFilterDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/ext_filter
2021-10-20 23:24:07,381 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 0 ExternalFilterDesc(s) out of 0 resource with 0 errors
2021-10-20 23:24:07,382 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading DataModelDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/model_desc
2021-10-20 23:24:07,408 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 1 DataModelDesc(s) out of 1 resource with 0 errors
2021-10-20 23:24:07,420 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 1 CubeDesc(s) out of 1 resource with 0 errors
2021-10-20 23:24:07,421 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 1 CubeInstance(s) out of 1 resource with 0 errors
2021-10-20 23:24:07,463 INFO [pool-1-thread-1] job.CubeBuildJob : There are 31 cuboids to be built in segment FULL_BUILD.
2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 30 has row keys: 4, 3, 2, 1 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 14 has row keys: 3, 2, 1 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 17 has row keys: 4, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 28 has row keys: 4, 3, 2 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 27 has row keys: 4, 3, 1, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 5 has row keys: 2, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 19 has row keys: 4, 1, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 6 has row keys: 2, 1 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 8 has row keys: 3 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 16 has row keys: 4 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 22 has row keys: 4, 2, 1 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 29 has row keys: 4, 3, 2, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 31 has row keys: 4, 3, 2, 1, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 20 has row keys: 4, 2 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 25 has row keys: 4, 3, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 13 has row keys: 3, 2, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 18 has row keys: 4, 1 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 21 has row keys: 4, 2, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 4 has row keys: 2 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 7 has row keys: 2, 1, 0 2021-10-20 23:24:07,464 DEBUG 
[pool-1-thread-1] job.CubeBuildJob : Cuboid 10 has row keys: 3, 1 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 9 has row keys: 3, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 1 has row keys: 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 11 has row keys: 3, 1, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 15 has row keys: 3, 2, 1, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 2 has row keys: 1 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 12 has row keys: 3, 2 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 23 has row keys: 4, 2, 1, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 3 has row keys: 1, 0 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 24 has row keys: 4, 3 2021-10-20 23:24:07,464 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 26 has row keys: 4, 3, 1 2021-10-20 23:24:08,784 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore 2021-10-20 23:24:08,822 INFO [pool-1-thread-1] metastore.ObjectStore : ObjectStore, initialize called 2021-10-20 23:24:08,985 INFO [pool-1-thread-1] DataNucleus.Persistence : Property hive.metastore.integral.jdo.pushdown unknown - will be ignored 2021-10-20 23:24:08,985 INFO [pool-1-thread-1] DataNucleus.Persistence : Property datanucleus.cache.level2 unknown - will be ignored 2021-10-20 23:24:08,985 INFO [pool-1-thread-1] DataNucleus.Persistence : Property datanucleus.schema.autoCreateAll unknown - will be ignored 2021-10-20 23:24:09,917 INFO [pool-1-thread-1] metastore.ObjectStore : Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order" 2021-10-20 23:24:10,850 INFO [pool-1-thread-1] 
DataNucleus.Datastore : The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 2021-10-20 23:24:10,851 INFO [pool-1-thread-1] DataNucleus.Datastore : The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 2021-10-20 23:24:10,947 INFO [pool-1-thread-1] DataNucleus.Datastore : The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 2021-10-20 23:24:10,947 INFO [pool-1-thread-1] DataNucleus.Datastore : The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 2021-10-20 23:24:11,020 INFO [pool-1-thread-1] DataNucleus.Query : Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing 2021-10-20 23:24:11,023 INFO [pool-1-thread-1] metastore.MetaStoreDirectSql : Using direct SQL, underlying DB is MYSQL 2021-10-20 23:24:11,027 INFO [pool-1-thread-1] metastore.ObjectStore : Initialized ObjectStore 2021-10-20 23:24:11,312 INFO [pool-1-thread-1] metastore.HiveMetaStore : Added admin role in metastore 2021-10-20 23:24:11,316 INFO [pool-1-thread-1] metastore.HiveMetaStore : Added public role in metastore 2021-10-20 23:24:11,372 INFO [pool-1-thread-1] metastore.HiveMetaStore : No user is added in admin role, since config is empty 2021-10-20 23:24:11,487 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: get_all_databases 2021-10-20 23:24:11,489 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_all_databases 2021-10-20 23:24:11,506 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: get_functions: db=default pat=* 2021-10-20 23:24:11,506 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=* 2021-10-20 23:24:11,508 INFO [pool-1-thread-1] 
DataNucleus.Datastore : The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table. 2021-10-20 23:24:11,557 INFO [pool-1-thread-1] session.SessionState : Created local directory: /tmp/040e7260-5549-4fd8-9cad-0e88844c960d_resources 2021-10-20 23:24:11,560 INFO [pool-1-thread-1] session.SessionState : Created HDFS directory: /tmp/hive/root/040e7260-5549-4fd8-9cad-0e88844c960d 2021-10-20 23:24:11,561 INFO [pool-1-thread-1] session.SessionState : Created local directory: /tmp/root/040e7260-5549-4fd8-9cad-0e88844c960d 2021-10-20 23:24:11,564 INFO [pool-1-thread-1] session.SessionState : Created HDFS directory: /tmp/hive/root/040e7260-5549-4fd8-9cad-0e88844c960d/_tmp_space.db 2021-10-20 23:24:11,578 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: get_database: default 2021-10-20 23:24:11,578 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_database: default 2021-10-20 23:24:11,592 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: get_database: global_temp 2021-10-20 23:24:11,593 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_database: global_temp 2021-10-20 23:24:11,596 WARN [pool-1-thread-1] metastore.ObjectStore : Failed to get database global_temp, returning NoSuchObjectException 2021-10-20 23:24:11,600 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: get_database: default 2021-10-20 23:24:11,601 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_database: default 2021-10-20 23:24:11,605 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: get_table : db=default tbl=dept 2021-10-20 23:24:11,605 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_table : db=default tbl=dept 2021-10-20 23:24:11,698 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: get_table : db=default tbl=dept 2021-10-20 23:24:11,698 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root 
ip=unknown-ip-addr cmd=get_table : db=default tbl=dept
2021-10-20 23:24:12,074 INFO [pool-1-thread-1] source.HiveSource : Source data sql is: select DEPTNO,DNAME,LOC from DEFAULT.DEPT
2021-10-20 23:24:12,075 INFO [pool-1-thread-1] source.HiveSource : Kylin schema
root
 |-- DEPTNO: integer (nullable = true)
 |-- DNAME: string (nullable = true)
 |-- LOC: integer (nullable = true)
2021-10-20 23:24:13,478 INFO [pool-1-thread-1] mapred.FileInputFormat : Total input paths to process : 3
2021-10-20 23:24:15,633 INFO [pool-1-thread-1] mapred.FileInputFormat : Total input paths to process : 3
2021-10-20 23:24:18,488 INFO [pool-1-thread-1] job.CubeBuildJob : Building job takes 11380 ms
2021-10-20 23:24:18,489 ERROR [pool-1-thread-1] application.SparkApplication : The spark job execute failed!
java.lang.IllegalStateException: Failed to build lookup table DEPT snapshot for Dup key found, key= DEPTNO
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66)
    at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93)
    at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2021-10-20 23:24:18,490 ERROR [pool-1-thread-1] application.JobMonitor : Job failed the 1 times.
java.lang.RuntimeException: Error execute org.apache.kylin.engine.spark.job.CubeBuildJob
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:96)
    at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Failed to build lookup table DEPT snapshot for Dup key found, key= DEPTNO
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66)
    at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93)
    ... 4 more
2021-10-20 23:24:18,520 INFO [pool-1-thread-1] cluster.YarnInfoFetcher : Cluster maximum resource allocation ResourceInfo(9000,8)
2021-10-20 23:24:18,547 INFO [pool-1-thread-1] application.SparkApplication : Executor task org.apache.kylin.engine.spark.job.CubeBuildJob with args :
{"distMetaUrl":"kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta","submitter":"ADMIN","dataRangeEnd":"9223372036854775807","targetModel":"31115e9c-7baa-61e6-2997-378ff63a38ec","dataRangeStart":"0","project":"empTest","className":"org.apache.kylin.engine.spark.job.CubeBuildJob","segmentName":"FULL_BUILD","parentId":"b532af1a-abef-4d0f-8b29-450744d499fc","jobId":"b532af1a-abef-4d0f-8b29-450744d499fc","outputMetaUrl":"kylin_metadata@jdbc,url=jdbc:mysql://localhost:3306/kylin,username=root,password=******,maxActive=10,maxIdle=10","segmentId":"4cf9a91e-67c7-9a04-7fec-d2732f5353a4","cuboidsNum":"31","cubeName":"testCube","jobType":"BUILD","cubeId":"30c0a683-9424-2157-5b51-43991cf95e4e","segmentIds":"4cf9a91e-67c7-9a04-7fec-d2732f5353a4"} 2021-10-20 23:24:18,547 INFO [pool-1-thread-1] utils.MetaDumpUtil : Ready to load KylinConfig from uri: kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta 2021-10-20 23:24:18,573 INFO [pool-1-thread-1] common.KylinConfigBase : Kylin Config was updated with kylin.metadata.url.identifier : kylin_metadata 2021-10-20 23:24:18,573 INFO [pool-1-thread-1] common.KylinConfigBase : Kylin Config was updated with kylin.log.spark-executor-properties-file : /opt/kylin/conf/spark-executor-log4j.properties 2021-10-20 23:24:18,573 INFO [pool-1-thread-1] common.KylinConfigBase : Kylin Config was updated with kylin.source.provider.0 : org.apache.kylin.engine.spark.source.HiveSource 2021-10-20 23:24:18,573 INFO [pool-1-thread-1] util.TimeZoneUtils : System timezone set to America/New_York, TimeZoneId: America/New_York. 2021-10-20 23:24:18,573 INFO [pool-1-thread-1] application.SparkApplication : Sleep for random seconds to avoid submitting too many spark job at the same time. 
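[Annotation] The failure above is deterministic: `CubeSnapshotBuilder.checkDupKey` (CubeSnapshotBuilder.scala:189) aborts the build because the snapshot key column DEPTNO is not unique in the DEPT lookup table (`select DEPTNO,DNAME,LOC from DEFAULT.DEPT`). A minimal Python sketch of an equivalent duplicate-key check; the function name and sample rows are illustrative, not Kylin's actual code:

```python
from collections import Counter

def check_dup_key(rows, key_index=0):
    """Illustrative re-implementation of the invariant checkDupKey enforces
    before materializing a lookup-table snapshot: every key must be unique."""
    counts = Counter(row[key_index] for row in rows)
    dups = [k for k, n in counts.items() if n > 1]
    if dups:
        # Analogous to: "Failed to build lookup table ... Dup key found"
        raise ValueError(f"Dup key found, key(s)={dups}")

# Hypothetical DEPT rows as (DEPTNO, DNAME, LOC); DEPTNO 10 appears twice,
# reproducing the condition that failed this build.
dept = [(10, "ACCOUNTING", 1700), (20, "RESEARCH", 1800), (10, "SALES", 1900)]
try:
    check_dup_key(dept)
except ValueError as e:
    print(e)  # prints: Dup key found, key(s)=[10]
```

The check runs before any cuboid is built, which is why the job fails in ParentSourceChooser.decideFlatTableSource rather than during aggregation.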
2021-10-20 23:24:39,297 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stdout {"scheduler":{"schedulerInfo":{"type":"capacityScheduler","capacity":100.0,"usedCapacity":25.199106,"maxCapacity":100.0,"queueName":"root","queues":{"queue":[{"type":"capacitySchedulerLeafQueueInfo","capacity":100.0,"usedCapacity":25.199106,"maxCapacity":100.0,"absoluteCapacity":100.0,"absoluteMaxCapacity":100.0,"absoluteUsedCapacity":25.199106,"numApplications":1,"queueName":"default","state":"RUNNING","resourcesUsed":{"memory":11264,"vCores":6},"hideReservationQueues":false,"nodeLabels":["*"],"numActiveApplications":1,"numPendingApplications":0,"numContainers":6,"maxApplications":10000,"maxApplicationsPerUser":10000,"userLimit":100,"users":{"user":[{"username":"root","resourcesUsed":{"memory":11264,"vCores":6},"numPendingApplications":0,"numActiveApplications":1,"AMResourceUsed":{"memory":1024,"vCores":1},"userResourceLimit":{"memory":45056,"vCores":1}}]},"userLimitFactor":1.0,"AMResourceLimit":{"memory":5120,"vCores":1},"usedAMResource":{"memory":1024,"vCores":1},"userAMResourceLimit":{"memory":5120,"vCores":1},"preemptionDisabled":true}]}}}} 2021-10-20 23:24:39,297 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr % Total % Received % Xferd Average Speed Time Time Time Current 2021-10-20 23:24:39,297 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed 2021-10-20 23:24:39,298 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 2021-10-20 23:24:39,298 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0 2021-10-20 23:24:39,298 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 100 1061 0 1061 0 0 103k 0 --:--:-- --:--:-- --:--:-- 115k 2021-10-20 23:24:39,298 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : Thread wait for executing command curl -k --negotiate -u : "http://henghe-031:8088/ws/v1/cluster/scheduler" 2021-10-20 23:24:39,314 
INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr % Total % Received % Xferd Average Speed Time Time Time Current 2021-10-20 23:24:39,314 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed 2021-10-20 23:24:39,314 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 2021-10-20 23:24:39,314 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed connect to henghe-030:8088; Connection refused 2021-10-20 23:24:39,314 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : Thread wait for executing command curl -k --negotiate -u : "http://henghe-030:8088/ws/v1/cluster/scheduler" 2021-10-20 23:24:39,320 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Queue available capacity: 0.74800894. 2021-10-20 23:24:39,321 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Estimate total cluster resource is ResourceInfo(44699,2147483647). 2021-10-20 23:24:39,321 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Cluster available capacity: 0.74800894. 2021-10-20 23:24:39,321 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Capacity actual available resource: AvailableResource(ResourceInfo(33435,1606336966),ResourceInfo(44699,2147483647)). 2021-10-20 23:24:39,325 WARN [pool-1-thread-1] sql.SparkSession$Builder : Using an existing SparkSession; the static sql configurations will not take effect. 2021-10-20 23:24:39,325 WARN [pool-1-thread-1] sql.SparkSession$Builder : Using an existing SparkSession; some spark core configurations may not take effect. 2021-10-20 23:24:39,370 INFO [pool-1-thread-1] job.CubeBuildJob : Start building cube job for 4cf9a91e-67c7-9a04-7fec-d2732f5353a4 ... 
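[Annotation] Between retries, CapacitySchedulerParser derives the free share of the queue from the YARN scheduler JSON curled above: available = (maxCapacity - usedCapacity) / 100, which gives the logged 0.74800894 for usedCapacity 25.199106. A hedged sketch of that arithmetic (field subset and function name are illustrative):

```python
import json

def available_capacity(scheduler_json: str) -> float:
    # Fraction of the root queue still free, mirroring the
    # "Queue available capacity: 0.74800894" log line.
    info = json.loads(scheduler_json)["scheduler"]["schedulerInfo"]
    return (info["maxCapacity"] - info["usedCapacity"]) / 100.0

# Trimmed-down version of the scheduler response logged above.
sample = '{"scheduler":{"schedulerInfo":{"maxCapacity":100.0,"usedCapacity":25.199106}}}'
print(round(available_capacity(sample), 8))  # 0.74800894
```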
2021-10-20 23:24:39,370 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.cube.CubeManager 2021-10-20 23:24:39,370 INFO [pool-1-thread-1] cube.CubeManager : Initializing CubeManager with config kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta 2021-10-20 23:24:39,370 INFO [pool-1-thread-1] persistence.ResourceStore : Using metadata url kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta for resource store 2021-10-20 23:24:39,387 INFO [pool-1-thread-1] persistence.HDFSResourceStore : hdfs meta path : hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta 2021-10-20 23:24:39,388 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading CubeInstance from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/cube 2021-10-20 23:24:39,392 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.cube.CubeDescManager 2021-10-20 23:24:39,392 INFO [pool-1-thread-1] cube.CubeDescManager : Initializing CubeDescManager with config kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta 2021-10-20 23:24:39,393 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading CubeDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/cube_desc 2021-10-20 23:24:39,396 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.project.ProjectManager 2021-10-20 23:24:39,396 INFO [pool-1-thread-1] project.ProjectManager : Initializing ProjectManager with metadata url kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta 2021-10-20 23:24:39,396 
DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading ProjectInstance from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/project 2021-10-20 23:24:39,437 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 1 ProjectInstance(s) out of 1 resource with 0 errors 2021-10-20 23:24:39,437 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.cachesync.Broadcaster 2021-10-20 23:24:39,437 DEBUG [pool-1-thread-1] cachesync.Broadcaster : 1 nodes in the cluster: [localhost:7070] 2021-10-20 23:24:39,438 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.model.DataModelManager 2021-10-20 23:24:39,438 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.TableMetadataManager 2021-10-20 23:24:39,438 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading TableDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/table 2021-10-20 23:24:39,443 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 2 TableDesc(s) out of 2 resource with 0 errors 2021-10-20 23:24:39,443 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading TableExtDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/table_exd 2021-10-20 23:24:39,447 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 2 TableExtDesc(s) out of 2 resource with 0 errors 2021-10-20 23:24:39,447 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading ExternalFilterDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/ext_filter 2021-10-20 23:24:39,447 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 0 ExternalFilterDesc(s) out of 0 resource with 0 errors 2021-10-20 23:24:39,448 DEBUG [pool-1-thread-1] 
cachesync.CachedCrudAssist : Reloading DataModelDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/model_desc 2021-10-20 23:24:39,451 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 1 DataModelDesc(s) out of 1 resource with 0 errors 2021-10-20 23:24:39,451 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 1 CubeDesc(s) out of 1 resource with 0 errors 2021-10-20 23:24:39,451 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 1 CubeInstance(s) out of 1 resource with 0 errors 2021-10-20 23:24:39,473 INFO [pool-1-thread-1] job.CubeBuildJob : There are 31 cuboids to be built in segment FULL_BUILD. 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 25 has row keys: 4, 3, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 24 has row keys: 4, 3 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 27 has row keys: 4, 3, 1, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 3 has row keys: 1, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 22 has row keys: 4, 2, 1 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 13 has row keys: 3, 2, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 26 has row keys: 4, 3, 1 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 16 has row keys: 4 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 4 has row keys: 2 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 15 has row keys: 3, 2, 1, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 30 has row keys: 4, 3, 2, 1 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 23 has row keys: 4, 2, 1, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 31 has row keys: 4, 3, 2, 1, 0 2021-10-20 23:24:39,473 
DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 1 has row keys: 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 21 has row keys: 4, 2, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 20 has row keys: 4, 2 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 19 has row keys: 4, 1, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 2 has row keys: 1 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 14 has row keys: 3, 2, 1 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 29 has row keys: 4, 3, 2, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 18 has row keys: 4, 1 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 10 has row keys: 3, 1 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 12 has row keys: 3, 2 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 28 has row keys: 4, 3, 2 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 11 has row keys: 3, 1, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 5 has row keys: 2, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 7 has row keys: 2, 1, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 9 has row keys: 3, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 17 has row keys: 4, 0 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 8 has row keys: 3 2021-10-20 23:24:39,473 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 6 has row keys: 2, 1 2021-10-20 23:24:39,476 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: get_database: default 2021-10-20 23:24:39,476 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_database: default 2021-10-20 23:24:39,483 INFO [pool-1-thread-1] 
metastore.HiveMetaStore : 0: get_table : db=default tbl=dept 2021-10-20 23:24:39,483 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_table : db=default tbl=dept 2021-10-20 23:24:39,502 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: get_table : db=default tbl=dept 2021-10-20 23:24:39,502 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_table : db=default tbl=dept 2021-10-20 23:24:39,524 INFO [pool-1-thread-1] source.HiveSource : Source data sql is: select DEPTNO,DNAME,LOC from DEFAULT.DEPT 2021-10-20 23:24:39,524 INFO [pool-1-thread-1] source.HiveSource : Kylin schema root |-- DEPTNO: integer (nullable = true) |-- DNAME: string (nullable = true) |-- LOC: integer (nullable = true) 2021-10-20 23:24:39,580 INFO [pool-1-thread-1] mapred.FileInputFormat : Total input paths to process : 3 2021-10-20 23:24:39,819 INFO [pool-1-thread-1] mapred.FileInputFormat : Total input paths to process : 3 2021-10-20 23:24:40,697 INFO [pool-1-thread-1] job.CubeBuildJob : Building job takes 1327 ms 2021-10-20 23:24:40,697 ERROR [pool-1-thread-1] application.SparkApplication : The spark job execute failed! 
java.lang.IllegalStateException: Failed to build lookup table DEPT snapshot for Dup key found, key= DEPTNO
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66)
    at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93)
    at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2021-10-20 23:24:40,698 ERROR [pool-1-thread-1] application.JobMonitor : Job failed the 2 times.
java.lang.RuntimeException: Error execute org.apache.kylin.engine.spark.job.CubeBuildJob
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:96)
    at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Failed to build lookup table DEPT snapshot for Dup key found, key= DEPTNO
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66)
    at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93)
    ... 4 more
2021-10-20 23:24:40,713 INFO [pool-1-thread-1] application.SparkApplication : Executor task org.apache.kylin.engine.spark.job.CubeBuildJob with args : {"distMetaUrl":"kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta","submitter":"ADMIN","dataRangeEnd":"9223372036854775807","targetModel":"31115e9c-7baa-61e6-2997-378ff63a38ec","dataRangeStart":"0","project":"empTest","className":"org.apache.kylin.engine.spark.job.CubeBuildJob","segmentName":"FULL_BUILD","parentId":"b532af1a-abef-4d0f-8b29-450744d499fc","jobId":"b532af1a-abef-4d0f-8b29-450744d499fc","outputMetaUrl":"kylin_metadata@jdbc,url=jdbc:mysql://localhost:3306/kylin,username=root,password=******,maxActive=10,maxIdle=10","segmentId":"4cf9a91e-67c7-9a04-7fec-d2732f5353a4","cuboidsNum":"31","cubeName":"testCube","jobType":"BUILD","cubeId":"30c0a683-9424-2157-5b51-43991cf95e4e","segmentIds":"4cf9a91e-67c7-9a04-7fec-d2732f5353a4"}
2021-10-20 23:24:40,713 INFO [pool-1-thread-1] utils.MetaDumpUtil : Ready to load KylinConfig from uri: kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:24:40,729 INFO [pool-1-thread-1] common.KylinConfigBase : Kylin Config was updated with kylin.metadata.url.identifier : kylin_metadata
2021-10-20 23:24:40,729 INFO [pool-1-thread-1] common.KylinConfigBase : Kylin Config was updated with kylin.log.spark-executor-properties-file : /opt/kylin/conf/spark-executor-log4j.properties
2021-10-20 23:24:40,729 INFO [pool-1-thread-1] common.KylinConfigBase : Kylin Config was updated with kylin.source.provider.0 : org.apache.kylin.engine.spark.source.HiveSource
2021-10-20 23:24:40,729 INFO [pool-1-thread-1] util.TimeZoneUtils : System timezone set to America/New_York, TimeZoneId: America/New_York.
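[Annotation] JobMonitor retries the same executor task ("Job failed the 1 times", then "the 2 times"), but the duplicate rows are still present in DEFAULT.DEPT, so every retry hits the identical checkDupKey failure; retrying cannot succeed. The remediation is to make DEPTNO unique in the source table before resubmitting the build. One possible keep-first deduplication strategy, sketched in Python (an assumption about how one might clean the data, not something Kylin does automatically):

```python
def dedupe_by_key(rows, key_index=0):
    """Keep only the first row seen for each key value, so the lookup
    table satisfies the snapshot's unique-key constraint."""
    seen = set()
    out = []
    for row in rows:
        key = row[key_index]
        if key not in seen:   # later duplicates are dropped
            seen.add(key)
            out.append(row)
    return out

# Hypothetical DEPT rows with DEPTNO 10 duplicated.
dept = [(10, "ACCOUNTING", 1700), (20, "RESEARCH", 1800), (10, "SALES", 1900)]
print(dedupe_by_key(dept))  # [(10, 'ACCOUNTING', 1700), (20, 'RESEARCH', 1800)]
```

Which duplicate to keep is a data-quality decision; in practice one would inspect the conflicting rows (e.g. `GROUP BY DEPTNO HAVING count(*) > 1` in Hive) rather than drop them blindly.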
2021-10-20 23:24:40,729 INFO [pool-1-thread-1] application.SparkApplication : Sleep for random seconds to avoid submitting too many spark job at the same time. 2021-10-20 23:24:59,559 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stdout {"scheduler":{"schedulerInfo":{"type":"capacityScheduler","capacity":100.0,"usedCapacity":25.199106,"maxCapacity":100.0,"queueName":"root","queues":{"queue":[{"type":"capacitySchedulerLeafQueueInfo","capacity":100.0,"usedCapacity":25.199106,"maxCapacity":100.0,"absoluteCapacity":100.0,"absoluteMaxCapacity":100.0,"absoluteUsedCapacity":25.199106,"numApplications":1,"queueName":"default","state":"RUNNING","resourcesUsed":{"memory":11264,"vCores":6},"hideReservationQueues":false,"nodeLabels":["*"],"numActiveApplications":1,"numPendingApplications":0,"numContainers":6,"maxApplications":10000,"maxApplicationsPerUser":10000,"userLimit":100,"users":{"user":[{"username":"root","resourcesUsed":{"memory":11264,"vCores":6},"numPendingApplications":0,"numActiveApplications":1,"AMResourceUsed":{"memory":1024,"vCores":1},"userResourceLimit":{"memory":45056,"vCores":1}}]},"userLimitFactor":1.0,"AMResourceLimit":{"memory":5120,"vCores":1},"usedAMResource":{"memory":1024,"vCores":1},"userAMResourceLimit":{"memory":5120,"vCores":1},"preemptionDisabled":true}]}}}} 2021-10-20 23:24:59,559 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr % Total % Received % Xferd Average Speed Time Time Time Current 2021-10-20 23:24:59,559 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed 2021-10-20 23:24:59,559 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 2021-10-20 23:24:59,559 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0 2021-10-20 23:24:59,559 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 100 1061 0 1061 0 0 127k 0 --:--:-- --:--:-- --:--:-- 148k 2021-10-20 23:24:59,559 INFO [pool-1-thread-1] 
cluster.SchedulerInfoCmdHelper : Thread wait for executing command curl -k --negotiate -u : "http://henghe-031:8088/ws/v1/cluster/scheduler" 2021-10-20 23:24:59,582 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr % Total % Received % Xferd Average Speed Time Time Time Current 2021-10-20 23:24:59,582 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed 2021-10-20 23:24:59,582 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 2021-10-20 23:24:59,582 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed connect to henghe-030:8088; Connection refused 2021-10-20 23:24:59,582 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : Thread wait for executing command curl -k --negotiate -u : "http://henghe-030:8088/ws/v1/cluster/scheduler" 2021-10-20 23:24:59,589 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Queue available capacity: 0.74800894. 2021-10-20 23:24:59,589 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Estimate total cluster resource is ResourceInfo(44699,2147483647). 2021-10-20 23:24:59,590 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Cluster available capacity: 0.74800894. 2021-10-20 23:24:59,590 INFO [pool-1-thread-1] parser.CapacitySchedulerParser : Capacity actual available resource: AvailableResource(ResourceInfo(33435,1606336966),ResourceInfo(44699,2147483647)). 2021-10-20 23:24:59,591 WARN [pool-1-thread-1] sql.SparkSession$Builder : Using an existing SparkSession; the static sql configurations will not take effect. 2021-10-20 23:24:59,591 WARN [pool-1-thread-1] sql.SparkSession$Builder : Using an existing SparkSession; some spark core configurations may not take effect. 2021-10-20 23:24:59,629 INFO [pool-1-thread-1] job.CubeBuildJob : Start building cube job for 4cf9a91e-67c7-9a04-7fec-d2732f5353a4 ... 
2021-10-20 23:24:59,630 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.cube.CubeManager
2021-10-20 23:24:59,630 INFO [pool-1-thread-1] cube.CubeManager : Initializing CubeManager with config kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:24:59,630 INFO [pool-1-thread-1] persistence.ResourceStore : Using metadata url kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta for resource store
2021-10-20 23:24:59,647 INFO [pool-1-thread-1] persistence.HDFSResourceStore : hdfs meta path : hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:24:59,648 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading CubeInstance from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/cube
2021-10-20 23:24:59,653 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.cube.CubeDescManager
2021-10-20 23:24:59,653 INFO [pool-1-thread-1] cube.CubeDescManager : Initializing CubeDescManager with config kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:24:59,653 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading CubeDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/cube_desc
2021-10-20 23:24:59,656 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.project.ProjectManager
2021-10-20 23:24:59,656 INFO [pool-1-thread-1] project.ProjectManager : Initializing ProjectManager with metadata url kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:24:59,657 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading ProjectInstance from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/project
2021-10-20 23:24:59,663 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 1 ProjectInstance(s) out of 1 resource with 0 errors
2021-10-20 23:24:59,663 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.cachesync.Broadcaster
2021-10-20 23:24:59,663 DEBUG [pool-1-thread-1] cachesync.Broadcaster : 1 nodes in the cluster: [localhost:7070]
2021-10-20 23:24:59,664 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.model.DataModelManager
2021-10-20 23:24:59,664 INFO [pool-1-thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.TableMetadataManager
2021-10-20 23:24:59,664 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading TableDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/table
2021-10-20 23:24:59,669 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 2 TableDesc(s) out of 2 resource with 0 errors
2021-10-20 23:24:59,669 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading TableExtDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/table_exd
2021-10-20 23:24:59,675 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 2 TableExtDesc(s) out of 2 resource with 0 errors
2021-10-20 23:24:59,675 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading ExternalFilterDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/ext_filter
2021-10-20 23:24:59,676 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 0 ExternalFilterDesc(s) out of 0 resource with 0 errors
2021-10-20 23:24:59,676 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Reloading DataModelDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/model_desc
2021-10-20 23:24:59,681 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 1 DataModelDesc(s) out of 1 resource with 0 errors
2021-10-20 23:24:59,682 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 1 CubeDesc(s) out of 1 resource with 0 errors
2021-10-20 23:24:59,682 DEBUG [pool-1-thread-1] cachesync.CachedCrudAssist : Loaded 1 CubeInstance(s) out of 1 resource with 0 errors
2021-10-20 23:24:59,701 INFO [pool-1-thread-1] job.CubeBuildJob : There are 31 cuboids to be built in segment FULL_BUILD.
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 7 has row keys: 2, 1, 0
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 2 has row keys: 1
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 24 has row keys: 4, 3
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 14 has row keys: 3, 2, 1
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 23 has row keys: 4, 2, 1, 0
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 4 has row keys: 2
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 16 has row keys: 4
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 29 has row keys: 4, 3, 2, 0
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 13 has row keys: 3, 2, 0
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 26 has row keys: 4, 3, 1
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 3 has row keys: 1, 0
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 18 has row keys: 4, 1
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 6 has row keys: 2, 1
2021-10-20 23:24:59,701 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 5 has row keys: 2, 0
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 21 has row keys: 4, 2, 0
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 8 has row keys: 3
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 17 has row keys: 4, 0
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 22 has row keys: 4, 2, 1
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 20 has row keys: 4, 2
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 1 has row keys: 0
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 11 has row keys: 3, 1, 0
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 31 has row keys: 4, 3, 2, 1, 0
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 10 has row keys: 3, 1
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 12 has row keys: 3, 2
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 15 has row keys: 3, 2, 1, 0
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 19 has row keys: 4, 1, 0
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 28 has row keys: 4, 3, 2
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 9 has row keys: 3, 0
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 25 has row keys: 4, 3, 0
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 27 has row keys: 4, 3, 1, 0
2021-10-20 23:24:59,702 DEBUG [pool-1-thread-1] job.CubeBuildJob : Cuboid 30 has row keys: 4, 3, 2, 1
2021-10-20 23:24:59,703 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: get_database: default
2021-10-20 23:24:59,703 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_database: default
2021-10-20 23:24:59,709 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: get_table : db=default tbl=dept
2021-10-20 23:24:59,709 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_table : db=default tbl=dept
2021-10-20 23:24:59,726 INFO [pool-1-thread-1] metastore.HiveMetaStore : 0: get_table : db=default tbl=dept
2021-10-20 23:24:59,726 INFO [pool-1-thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_table : db=default tbl=dept
2021-10-20 23:24:59,742 INFO [pool-1-thread-1] source.HiveSource : Source data sql is: select DEPTNO,DNAME,LOC from DEFAULT.DEPT
2021-10-20 23:24:59,742 INFO [pool-1-thread-1] source.HiveSource : Kylin schema
root
 |-- DEPTNO: integer (nullable = true)
 |-- DNAME: string (nullable = true)
 |-- LOC: integer (nullable = true)
2021-10-20 23:24:59,787 INFO [pool-1-thread-1] mapred.FileInputFormat : Total input paths to process : 3
2021-10-20 23:25:00,029 INFO [pool-1-thread-1] mapred.FileInputFormat : Total input paths to process : 3
2021-10-20 23:25:00,276 INFO [pool-1-thread-1] job.CubeBuildJob : Building job takes 647 ms
2021-10-20 23:25:00,276 ERROR [pool-1-thread-1] application.SparkApplication : The spark job execute failed!
java.lang.IllegalStateException: Failed to build lookup table DEPT snapshot for Dup key found, key= DEPTNO
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66)
    at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93)
    at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2021-10-20 23:25:00,276 ERROR [pool-1-thread-1] application.JobMonitor : Job failed the 3 times.
java.lang.RuntimeException: Error execute org.apache.kylin.engine.spark.job.CubeBuildJob
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:96)
    at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Failed to build lookup table DEPT snapshot for Dup key found, key= DEPTNO
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66)
    at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93)
    ... 4 more
2021-10-20 23:25:00,293 INFO [pool-1-thread-1] application.SparkApplication : Executor task org.apache.kylin.engine.spark.job.CubeBuildJob with args : {"distMetaUrl":"kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta","submitter":"ADMIN","dataRangeEnd":"9223372036854775807","targetModel":"31115e9c-7baa-61e6-2997-378ff63a38ec","dataRangeStart":"0","project":"empTest","className":"org.apache.kylin.engine.spark.job.CubeBuildJob","segmentName":"FULL_BUILD","parentId":"b532af1a-abef-4d0f-8b29-450744d499fc","jobId":"b532af1a-abef-4d0f-8b29-450744d499fc","outputMetaUrl":"kylin_metadata@jdbc,url=jdbc:mysql://localhost:3306/kylin,username=root,password=******,maxActive=10,maxIdle=10","segmentId":"4cf9a91e-67c7-9a04-7fec-d2732f5353a4","cuboidsNum":"31","cubeName":"testCube","jobType":"BUILD","cubeId":"30c0a683-9424-2157-5b51-43991cf95e4e","segmentIds":"4cf9a91e-67c7-9a04-7fec-d2732f5353a4"}
2021-10-20 23:25:00,293 INFO [pool-1-thread-1] utils.MetaDumpUtil : Ready to load KylinConfig from uri: kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:25:00,365 INFO [pool-1-thread-1] common.KylinConfigBase : Kylin Config was updated with kylin.metadata.url.identifier : kylin_metadata
2021-10-20 23:25:00,366 INFO [pool-1-thread-1] common.KylinConfigBase : Kylin Config was updated with kylin.log.spark-executor-properties-file : /opt/kylin/conf/spark-executor-log4j.properties
2021-10-20 23:25:00,366 INFO [pool-1-thread-1] common.KylinConfigBase : Kylin Config was updated with kylin.source.provider.0 : org.apache.kylin.engine.spark.source.HiveSource
2021-10-20 23:25:00,366 INFO [pool-1-thread-1] util.TimeZoneUtils : System timezone set to America/New_York, TimeZoneId: America/New_York.
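Every retry in this log fails with the same root cause: the DEPT lookup table holds more than one row per DEPTNO, so Kylin's snapshot builder (`CubeSnapshotBuilder.checkDupKey` in the traces above) rejects it before any cuboid is built. The logic behind that check amounts to the following sketch; this is illustrative Python, not Kylin's actual Scala implementation, and the sample rows are hypothetical:

```python
from collections import Counter

def check_dup_key(rows, key_index=0):
    """Reject a lookup-table snapshot whose key column is not unique --
    the condition behind "Dup key found, key= DEPTNO" in the log."""
    counts = Counter(row[key_index] for row in rows)
    dups = sorted(k for k, n in counts.items() if n > 1)
    if dups:
        raise ValueError("Dup key found, key(s)=%s" % dups)
    return True

# Hypothetical DEPT rows as (DEPTNO, DNAME, LOC); DEPTNO 10 appears twice,
# which is exactly what makes the snapshot build fail.
dept_rows = [(10, "ACCOUNTING", 1700), (20, "RESEARCH", 1800), (10, "ACCOUNTING", 1700)]
try:
    check_dup_key(dept_rows)
except ValueError as e:
    print(e)  # Dup key found, key(s)=[10]
```

Because the failure is data-driven rather than resource-driven, the automatic retries below with larger spark.executor.memory cannot succeed; deduplicating DEFAULT.DEPT on DEPTNO in Hive (or choosing a genuinely unique key column for the lookup table) is the usual fix.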
2021-10-20 23:25:00,367 INFO [pool-1-thread-1] application.SparkApplication : Sleep for random seconds to avoid submitting too many spark job at the same time.
2021-10-20 23:25:20,183 INFO [Thread-1] cluster.SchedulerInfoCmdHelper : stdout {"scheduler":{"schedulerInfo":{"type":"capacityScheduler","capacity":100.0,"usedCapacity":25.199106,"maxCapacity":100.0,"queueName":"root","queues":{"queue":[{"type":"capacitySchedulerLeafQueueInfo","capacity":100.0,"usedCapacity":25.199106,"maxCapacity":100.0,"absoluteCapacity":100.0,"absoluteMaxCapacity":100.0,"absoluteUsedCapacity":25.199106,"numApplications":1,"queueName":"default","state":"RUNNING","resourcesUsed":{"memory":11264,"vCores":6},"hideReservationQueues":false,"nodeLabels":["*"],"numActiveApplications":1,"numPendingApplications":0,"numContainers":6,"maxApplications":10000,"maxApplicationsPerUser":10000,"userLimit":100,"users":{"user":[{"username":"root","resourcesUsed":{"memory":11264,"vCores":6},"numPendingApplications":0,"numActiveApplications":1,"AMResourceUsed":{"memory":1024,"vCores":1},"userResourceLimit":{"memory":45056,"vCores":1}}]},"userLimitFactor":1.0,"AMResourceLimit":{"memory":5120,"vCores":1},"usedAMResource":{"memory":1024,"vCores":1},"userAMResourceLimit":{"memory":5120,"vCores":1},"preemptionDisabled":true}]}}}}
2021-10-20 23:25:20,184 INFO [Thread-1] cluster.SchedulerInfoCmdHelper : stderr % Total % Received % Xferd Average Speed Time Time Time Current
2021-10-20 23:25:20,184 INFO [Thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed
2021-10-20 23:25:20,184 INFO [Thread-1] cluster.SchedulerInfoCmdHelper : stderr
2021-10-20 23:25:20,184 INFO [Thread-1] cluster.SchedulerInfoCmdHelper : stderr 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
2021-10-20 23:25:20,184 INFO [Thread-1] cluster.SchedulerInfoCmdHelper : stderr 100 1061 0 1061 0 0 130k 0 --:--:-- --:--:-- --:--:-- 148k
2021-10-20 23:25:20,184 INFO [Thread-1] cluster.SchedulerInfoCmdHelper : Thread wait for executing command curl -k --negotiate -u : "http://henghe-031:8088/ws/v1/cluster/scheduler"
2021-10-20 23:25:20,203 INFO [Thread-1] cluster.SchedulerInfoCmdHelper : stderr % Total % Received % Xferd Average Speed Time Time Time Current
2021-10-20 23:25:20,203 INFO [Thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed
2021-10-20 23:25:20,203 INFO [Thread-1] cluster.SchedulerInfoCmdHelper : stderr
2021-10-20 23:25:20,203 INFO [Thread-1] cluster.SchedulerInfoCmdHelper : stderr 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed connect to henghe-030:8088; Connection refused
2021-10-20 23:25:20,203 INFO [Thread-1] cluster.SchedulerInfoCmdHelper : Thread wait for executing command curl -k --negotiate -u : "http://henghe-030:8088/ws/v1/cluster/scheduler"
2021-10-20 23:25:20,210 INFO [Thread-1] parser.CapacitySchedulerParser : Queue available capacity: 0.74800894.
2021-10-20 23:25:20,210 INFO [Thread-1] parser.CapacitySchedulerParser : Estimate total cluster resource is ResourceInfo(44699,2147483647).
2021-10-20 23:25:20,210 INFO [Thread-1] parser.CapacitySchedulerParser : Cluster available capacity: 0.74800894.
2021-10-20 23:25:20,210 INFO [Thread-1] parser.CapacitySchedulerParser : Capacity actual available resource: AvailableResource(ResourceInfo(33435,1606336966),ResourceInfo(44699,2147483647)).
2021-10-20 23:25:20,211 WARN [Thread-1] sql.SparkSession$Builder : Using an existing SparkSession; the static sql configurations will not take effect.
2021-10-20 23:25:20,211 WARN [Thread-1] sql.SparkSession$Builder : Using an existing SparkSession; some spark core configurations may not take effect.
2021-10-20 23:25:20,250 INFO [Thread-1] job.CubeBuildJob : Start building cube job for 4cf9a91e-67c7-9a04-7fec-d2732f5353a4 ...
2021-10-20 23:25:20,250 INFO [Thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.cube.CubeManager
2021-10-20 23:25:20,250 INFO [Thread-1] cube.CubeManager : Initializing CubeManager with config kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:25:20,250 INFO [Thread-1] persistence.ResourceStore : Using metadata url kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta for resource store
2021-10-20 23:25:20,267 INFO [Thread-1] persistence.HDFSResourceStore : hdfs meta path : hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:25:20,268 DEBUG [Thread-1] cachesync.CachedCrudAssist : Reloading CubeInstance from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/cube
2021-10-20 23:25:20,272 INFO [Thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.cube.CubeDescManager
2021-10-20 23:25:20,272 INFO [Thread-1] cube.CubeDescManager : Initializing CubeDescManager with config kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:25:20,272 DEBUG [Thread-1] cachesync.CachedCrudAssist : Reloading CubeDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/cube_desc
2021-10-20 23:25:20,276 INFO [Thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.project.ProjectManager
2021-10-20 23:25:20,276 INFO [Thread-1] project.ProjectManager : Initializing ProjectManager with metadata url kylin_metadata@hdfs,path=hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta
2021-10-20 23:25:20,276 DEBUG [Thread-1] cachesync.CachedCrudAssist : Reloading ProjectInstance from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/project
2021-10-20 23:25:20,280 DEBUG [Thread-1] cachesync.CachedCrudAssist : Loaded 1 ProjectInstance(s) out of 1 resource with 0 errors
2021-10-20 23:25:20,280 INFO [Thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.cachesync.Broadcaster
2021-10-20 23:25:20,280 DEBUG [Thread-1] cachesync.Broadcaster : 1 nodes in the cluster: [localhost:7070]
2021-10-20 23:25:20,281 INFO [Thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.model.DataModelManager
2021-10-20 23:25:20,281 INFO [Thread-1] common.KylinConfig : Creating new manager instance of class org.apache.kylin.metadata.TableMetadataManager
2021-10-20 23:25:20,281 DEBUG [Thread-1] cachesync.CachedCrudAssist : Reloading TableDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/table
2021-10-20 23:25:20,286 DEBUG [Thread-1] cachesync.CachedCrudAssist : Loaded 2 TableDesc(s) out of 2 resource with 0 errors
2021-10-20 23:25:20,286 DEBUG [Thread-1] cachesync.CachedCrudAssist : Reloading TableExtDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/table_exd
2021-10-20 23:25:20,290 DEBUG [Thread-1] cachesync.CachedCrudAssist : Loaded 2 TableExtDesc(s) out of 2 resource with 0 errors
2021-10-20 23:25:20,290 DEBUG [Thread-1] cachesync.CachedCrudAssist : Reloading ExternalFilterDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/ext_filter
2021-10-20 23:25:20,291 DEBUG [Thread-1] cachesync.CachedCrudAssist : Loaded 0 ExternalFilterDesc(s) out of 0 resource with 0 errors
2021-10-20 23:25:20,291 DEBUG [Thread-1] cachesync.CachedCrudAssist : Reloading DataModelDesc from hdfs://master/kylin/kylin_metadata/empTest/job_tmp/b532af1a-abef-4d0f-8b29-450744d499fc-01/meta/model_desc
2021-10-20 23:25:20,294 DEBUG [Thread-1] cachesync.CachedCrudAssist : Loaded 1 DataModelDesc(s) out of 1 resource with 0 errors
2021-10-20 23:25:20,294 DEBUG [Thread-1] cachesync.CachedCrudAssist : Loaded 1 CubeDesc(s) out of 1 resource with 0 errors
2021-10-20 23:25:20,294 DEBUG [Thread-1] cachesync.CachedCrudAssist : Loaded 1 CubeInstance(s) out of 1 resource with 0 errors
2021-10-20 23:25:20,315 INFO [Thread-1] job.CubeBuildJob : There are 31 cuboids to be built in segment FULL_BUILD.
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 14 has row keys: 3, 2, 1
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 15 has row keys: 3, 2, 1, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 28 has row keys: 4, 3, 2
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 12 has row keys: 3, 2
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 26 has row keys: 4, 3, 1
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 29 has row keys: 4, 3, 2, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 24 has row keys: 4, 3
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 3 has row keys: 1, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 6 has row keys: 2, 1
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 27 has row keys: 4, 3, 1, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 20 has row keys: 4, 2
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 18 has row keys: 4, 1
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 22 has row keys: 4, 2, 1
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 31 has row keys: 4, 3, 2, 1, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 19 has row keys: 4, 1, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 8 has row keys: 3
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 5 has row keys: 2, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 1 has row keys: 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 25 has row keys: 4, 3, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 13 has row keys: 3, 2, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 7 has row keys: 2, 1, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 4 has row keys: 2
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 23 has row keys: 4, 2, 1, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 21 has row keys: 4, 2, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 2 has row keys: 1
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 11 has row keys: 3, 1, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 9 has row keys: 3, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 10 has row keys: 3, 1
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 17 has row keys: 4, 0
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 30 has row keys: 4, 3, 2, 1
2021-10-20 23:25:20,315 DEBUG [Thread-1] job.CubeBuildJob : Cuboid 16 has row keys: 4
2021-10-20 23:25:20,317 INFO [Thread-1] metastore.HiveMetaStore : 0: get_database: default
2021-10-20 23:25:20,317 INFO [Thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_database: default
2021-10-20 23:25:20,320 INFO [Thread-1] metastore.HiveMetaStore : 0: get_table : db=default tbl=dept
2021-10-20 23:25:20,320 INFO [Thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_table : db=default tbl=dept
2021-10-20 23:25:20,335 INFO [Thread-1] metastore.HiveMetaStore : 0: get_table : db=default tbl=dept
2021-10-20 23:25:20,335 INFO [Thread-1] HiveMetaStore.audit : ugi=root ip=unknown-ip-addr cmd=get_table : db=default tbl=dept
2021-10-20 23:25:20,355 INFO [Thread-1] source.HiveSource : Source data sql is: select DEPTNO,DNAME,LOC from DEFAULT.DEPT
2021-10-20 23:25:20,355 INFO [Thread-1] source.HiveSource : Kylin schema
root
 |-- DEPTNO: integer (nullable = true)
 |-- DNAME: string (nullable = true)
 |-- LOC: integer (nullable = true)
2021-10-20 23:25:20,405 INFO [Thread-1] mapred.FileInputFormat : Total input paths to process : 3
2021-10-20 23:25:20,692 INFO [Thread-1] mapred.FileInputFormat : Total input paths to process : 3
2021-10-20 23:25:20,958 INFO [Thread-1] job.CubeBuildJob : Building job takes 708 ms
2021-10-20 23:25:20,959 ERROR [Thread-1] application.SparkApplication : The spark job execute failed!
java.lang.IllegalStateException: Failed to build lookup table DEPT snapshot for Dup key found, key= DEPTNO
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66)
    at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93)
    at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2021-10-20 23:25:20,960 ERROR [Thread-1] application.JobWorkSpace : Job failed eventually. Reason: Retry times exceed MaxRetry set in the KylinConfig.
java.lang.RuntimeException: Error execute org.apache.kylin.engine.spark.job.CubeBuildJob
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:96)
    at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Failed to build lookup table DEPT snapshot for Dup key found, key= DEPTNO
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66)
    at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93)
    ... 4 more
2021-10-20 23:25:20,965 INFO [Thread-1] application.SparkApplication : ==========================[BUILD CUBE]===============================
auto spark config :{spark.executor.memory=1GB, count_distinct=false, spark.executor.cores=1, spark.executor.memoryOverhead=512MB, spark.executor.instances=5, spark.yarn.queue=default, spark.sql.shuffle.partitions=2}
wait time: 0
build time: 709
build from layouts :
build from flat table :
cuboids num per segment : {}
abnormal layouts : {}
retry times : 4
job retry infos :
RetryInfo{ overrideConf : {spark.executor.memory=1536MB, spark.executor.memoryOverhead=308MB}, throwable :
java.lang.RuntimeException: Error execute org.apache.kylin.engine.spark.job.CubeBuildJob
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:96)
    at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Failed to build lookup table DEPT snapshot for Dup key found, key= DEPTNO
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66)
    at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93)
    ... 4 more }
RetryInfo{ overrideConf : {spark.executor.memory=2304MB, spark.executor.memoryOverhead=461MB}, throwable :
java.lang.RuntimeException: Error execute org.apache.kylin.engine.spark.job.CubeBuildJob
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:96)
    at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Failed to build lookup table DEPT snapshot for Dup key found, key= DEPTNO
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66)
    at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304)
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93)
    ... 4 more }
RetryInfo{ overrideConf : {spark.executor.memory=3456MB, spark.executor.memoryOverhead=692MB}, throwable :
java.lang.RuntimeException: Error execute org.apache.kylin.engine.spark.job.CubeBuildJob
    at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:96)
    at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Failed to build lookup table DEPT snapshot for Dup key found, key= DEPTNO
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71)
    at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at
scala.collection.AbstractIterable.foreach(Iterable.scala:54) at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66) at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178) at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304) at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93) ... 4 more } RetryInfo{ overrideConf : {}, throwable : java.lang.RuntimeException: Error execute org.apache.kylin.engine.spark.job.CubeBuildJob at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:96) at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.IllegalStateException: Failed to build lookup table DEPT snapshot for Dup key found, key= DEPTNO at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198) at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186) at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189) at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83) at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71) at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66) at scala.collection.Iterator$class.foreach(Iterator.scala:893) at 
scala.collection.AbstractIterator.foreach(Iterator.scala:1336) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66) at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178) at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304) at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93) ... 4 more } ==========================[BUILD CUBE]===============================
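Every retry fails with the same root cause, regardless of the executor memory the automatic retries add: the lookup (dimension) table DEPT contains more than one row with the same snapshot key DEPTNO, so `CubeSnapshotBuilder.checkDupKey` aborts the build before any cuboid is computed. The check itself is simple; here is a minimal, self-contained sketch of the same uniqueness validation (the table and column names come from the log above, but the helper function is hypothetical and not Kylin's actual API):

```python
from collections import Counter

def check_dup_key(rows, key_column):
    """Raise, as Kylin's snapshot builder does, when a lookup table's
    snapshot key column is not unique across its rows.

    rows: list of dicts representing the lookup table (e.g. DEPT).
    key_column: the snapshot key from the cube model (e.g. "DEPTNO").
    """
    counts = Counter(row[key_column] for row in rows)
    dups = [k for k, n in counts.items() if n > 1]
    if dups:
        raise ValueError(
            f"Dup key found, key= {key_column} (duplicated values: {dups})")

# A DEPT table with a duplicated DEPTNO reproduces the failure:
dept = [
    {"DEPTNO": 10, "DNAME": "ACCOUNTING"},
    {"DEPTNO": 20, "DNAME": "RESEARCH"},
    {"DEPTNO": 20, "DNAME": "RESEARCH"},  # duplicate snapshot key
]
try:
    check_dup_key(dept, "DEPTNO")
except ValueError as e:
    print(e)  # Dup key found, key= DEPTNO (duplicated values: [20])
```

Because this is a data-quality check and not a resource problem, the fix belongs at the source, not in Spark tuning: deduplicate DEPT in Hive (for example with `GROUP BY DEPTNO` or a `ROW_NUMBER()`-based filter) and then rebuild the segment.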