  1. Kylin
  2. KYLIN-4017

Build engine fails to acquire the ZooKeeper (ZK) lock when building a job, which stops the whole build engine from working.

    Details

    • Flags:
      Important

      Description

      Kylin throws an exception while acquiring the ZooKeeper (ZK) lock when it builds a job. Only a restart resolves the problem; until then no job can be built and the whole build engine stops working. The problem recurs about one day after each restart. The log looks like this:

      2019-05-15 11:09:43,209 INFO [FetcherRunner 1910115020-57] threadpool.FetcherRunner:59 : CubingJob{id=878974c4-4c65-88a4-a912-b238fcc33bdc, name=BUILD CUBE - es_report_respnse_rate_cube - 20190513000000_20190514000000 - GMT+08:00 2019-05-15 11:03:15, state=READY} prepare to schedule and its priority is 20
      2019-05-15 11:09:43,209 INFO [FetcherRunner 1910115020-57] threadpool.FetcherRunner:63 : CubingJob{id=878974c4-4c65-88a4-a912-b238fcc33bdc, name=BUILD CUBE - es_report_respnse_rate_cube - 20190513000000_20190514000000 - GMT+08:00 2019-05-15 11:03:15, state=READY} scheduled
      2019-05-15 11:09:43,209 DEBUG [Scheduler 719764581 Job 878974c4-4c65-88a4-a912-b238fcc33bdc-132] zookeeper.ZookeeperDistributedLock:92 : 18786@bigdata-kylin-build01.gz01.diditaxi.com trying to lock /job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc
      2019-05-15 11:09:43,212 ERROR [pool-12-thread-10] threadpool.DistributedScheduler:115 : unknown error execute job:878974c4-4c65-88a4-a912-b238fcc33bdc in server: 18786@bigdata-kylin-build01.gz01.diditaxi.com
      java.lang.IllegalStateException: Error while 18786@bigdata-kylin-build01.gz01.diditaxi.com trying to lock /job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc
       at org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock.lock(ZookeeperDistributedLock.java:99)
       at org.apache.kylin.job.lock.zookeeper.ZookeeperJobLock.lock(ZookeeperJobLock.java:41)
       at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:105)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       at java.lang.Thread.run(Thread.java:745)
      Caused by: java.lang.IllegalStateException: instance must be started before calling this method
       at org.apache.curator.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:176)
       at org.apache.curator.framework.imps.CuratorFrameworkImpl.create(CuratorFrameworkImpl.java:351)
       at org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock.lock(ZookeeperDistributedLock.java:95)
       ... 5 more
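
      The "Caused by" line in the trace shows that the client was not in a started state when create() was called. The following is a minimal sketch of guarding against that condition, not Kylin's or Curator's actual code: FakeClient, GuardedLock and startedClient are hypothetical stand-ins that only model the lifecycle check, and whether a guard like this matches the real root cause or fix is not stated in this issue.

```java
import java.util.function.Supplier;

// Hypothetical model only: none of these types are Kylin or Curator classes.
class LockGuardSketch {

    enum State { LATENT, STARTED, STOPPED }

    /** Minimal client that rejects calls unless it is currently started. */
    static class FakeClient {
        private State state = State.LATENT;

        void start() { state = State.STARTED; }
        void close() { state = State.STOPPED; }
        State getState() { return state; }

        void create(String path) {
            if (state != State.STARTED) {
                // Same message as the "Caused by" line in the log.
                throw new IllegalStateException(
                        "instance must be started before calling this method");
            }
            // A real client would create the lock node here.
        }
    }

    /** Factory assumed for this sketch: returns a freshly started client. */
    static FakeClient startedClient() {
        FakeClient c = new FakeClient();
        c.start();
        return c;
    }

    /** Holds a client and replaces it when it is no longer started. */
    static class GuardedLock {
        private final Supplier<FakeClient> factory;
        private FakeClient client;

        GuardedLock(Supplier<FakeClient> factory) {
            this.factory = factory;
            this.client = factory.get();
        }

        FakeClient currentClient() { return client; }

        /** Rebuilds the client if it is not started, then creates the lock node. */
        boolean lock(String path) {
            if (client.getState() != State.STARTED) {
                client = factory.get();
            }
            client.create(path);
            return true;
        }
    }

    public static void main(String[] args) {
        String path = "/job_engine/lock/example-job"; // illustrative path

        // Unguarded: a closed client fails the same way as in the log.
        FakeClient closed = startedClient();
        closed.close();
        try {
            closed.create(path);
        } catch (IllegalStateException e) {
            System.out.println("unguarded: " + e.getMessage());
        }

        // Guarded: the lock helper notices the stopped client and replaces it.
        GuardedLock guarded = new GuardedLock(LockGuardSketch::startedClient);
        guarded.currentClient().close();
        System.out.println("guarded lock acquired: " + guarded.lock(path));
    }
}
```

      The design choice in this sketch is to check the client state at the point of use instead of assuming that a client created at startup stays usable for the life of the process.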

       

        Attachments

        1. zkinstancestart.png
          347 kB
          wangxiaojing

              People

              • Assignee:
                Unassigned
                Reporter:
                wangxiaojing
              • Votes:
                0
                Watchers:
                3