  1. Kylin
  2. KYLIN-4017

Build engine fails to acquire the ZooKeeper (zk) lock when building a job, which stops the whole build engine from working.


Details

    • Important

    Description

      Kylin throws an exception while acquiring the ZooKeeper lock when it builds a job. Only a restart resolves the problem; until then no job can be built and the whole build engine stops working. The problem occurs again about one day after the restart. The log looks like this:

      2019-05-15 11:09:43,209 INFO [FetcherRunner 1910115020-57] threadpool.FetcherRunner:59 : CubingJob{id=878974c4-4c65-88a4-a912-b238fcc33bdc, name=BUILD CUBE - es_report_respnse_rate_cube - 20190513000000_20190514000000 - GMT+08:00 2019-05-15 11:03:15, state=READY} prepare to schedule and its priority is 20
      2019-05-15 11:09:43,209 INFO [FetcherRunner 1910115020-57] threadpool.FetcherRunner:63 : CubingJob{id=878974c4-4c65-88a4-a912-b238fcc33bdc, name=BUILD CUBE - es_report_respnse_rate_cube - 20190513000000_20190514000000 - GMT+08:00 2019-05-15 11:03:15, state=READY} scheduled
      2019-05-15 11:09:43,209 DEBUG [Scheduler 719764581 Job 878974c4-4c65-88a4-a912-b238fcc33bdc-132] zookeeper.ZookeeperDistributedLock:92 : 18786@bigdata-kylin-build01.gz01.diditaxi.com trying to lock /job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc
      2019-05-15 11:09:43,212 ERROR [pool-12-thread-10] threadpool.DistributedScheduler:115 : unknown error execute job:878974c4-4c65-88a4-a912-b238fcc33bdc in server: 18786@bigdata-kylin-build01.gz01.diditaxi.com
      java.lang.IllegalStateException: Error while 18786@bigdata-kylin-build01.gz01.diditaxi.com trying to lock /job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc
       at org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock.lock(ZookeeperDistributedLock.java:99)
       at org.apache.kylin.job.lock.zookeeper.ZookeeperJobLock.lock(ZookeeperJobLock.java:41)
       at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:105)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       at java.lang.Thread.run(Thread.java:745)
      Caused by: java.lang.IllegalStateException: instance must be started before calling this method
       at org.apache.curator.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:176)
       at org.apache.curator.framework.imps.CuratorFrameworkImpl.create(CuratorFrameworkImpl.java:351)
       at org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock.lock(ZookeeperDistributedLock.java:95)
       ... 5 more
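
      The "Caused by" line in the stack trace above comes from a state check: Curator's create() rejects the call unless the client is in the STARTED state, which means the client was either never started or has already been closed. The following is a minimal sketch of that lifecycle using only the JDK. FakeClient, tryLock and LockStateSketch are hypothetical names for illustration; this is a simplified stand-in, not Kylin's or Curator's actual code, and it does not claim to identify why the client left the STARTED state in this report.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Simplified model of a client with a LATENT -> STARTED -> STOPPED lifecycle. */
final class LockStateSketch {

    enum State { LATENT, STARTED, STOPPED }

    /** Hypothetical stand-in for a ZooKeeper client; models only the state checks. */
    static final class FakeClient {
        private final AtomicReference<State> state = new AtomicReference<>(State.LATENT);

        /** A client can be started once only; a stopped client cannot be started again. */
        void start() {
            if (!state.compareAndSet(State.LATENT, State.STARTED)) {
                throw new IllegalStateException("Cannot be started more than once");
            }
        }

        void close() {
            state.set(State.STOPPED);
        }

        /** Pretends to create a lock node; rejects the call unless the client is STARTED. */
        String create(String path) {
            if (state.get() != State.STARTED) {
                throw new IllegalStateException(
                        "instance must be started before calling this method");
            }
            return path;
        }
    }

    /** Returns true if the lock node was "created", false if the client was unusable. */
    static boolean tryLock(FakeClient client, String lockPath) {
        try {
            client.create(lockPath);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient();
        client.start();
        System.out.println(tryLock(client, "/job_engine/lock/demo")); // started client: true
        client.close();
        System.out.println(tryLock(client, "/job_engine/lock/demo")); // closed client: false
    }
}
```

      In this model, once the client has been closed, every later lock attempt fails and the same instance cannot be started again, so only a fresh client instance can lock. That would be consistent with the observation that only a restart helps, but it remains an assumption about this issue.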

       

      Attachments

        1. zkinstancestart.png
          347 kB
          wangxiaojing

            People

              Unassigned
              wangxiaojing
              Votes: 0
              Watchers: 4