Apache Drill / DRILL-1804

Random failures while running a large number of queries

Details

    Description

      #Tue Dec 02 14:38:34 EST 2014
      git.commit.id.abbrev=757e9a2

      When running the Mondrian regression tests (over 6000 queries), I occasionally get one or two random failures. Here is the stack trace from one such failure:

      2014-12-02 17:49:32,271 [2b8193d3-f0ca-aa7c-094a-d8234d76d068:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - Error aeae057b-ed0a-43aa-902d-fe3a41531511: Query failed: Unexpected exception during fragment initialization.
      org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization.
      at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:194) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
      at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
      at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
      Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper. Failure while accessing Zookeeper
      at org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:111) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
      at org.apache.drill.exec.work.foreman.QueryStatus.updateQueryStateInStore(QueryStatus.java:132) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
      at org.apache.drill.exec.work.foreman.Foreman.recordNewState(Foreman.java:502) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
      at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:396) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
      at org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:311) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
      at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:510) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
      at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:185) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
      ... 4 common frames omitted
      Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper
      at org.apache.drill.exec.store.sys.zk.ZkEStore.createNodeInZK(ZkEStore.java:53) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
      at org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:106) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
      ... 10 common frames omitted
      Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /drill/running/2b8193d3-f0ca-aa7c-094a-d8234d76d068
      at org.apache.zookeeper.KeeperException.create(KeeperException.java:119) ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
      at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
      at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
      at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676) ~[curator-framework-2.5.0.jar:na]
      at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660) ~[curator-framework-2.5.0.jar:na]
      at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) ~[curator-client-2.5.0.jar:na]
      at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656) ~[curator-framework-2.5.0.jar:na]
      at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441) ~[curator-framework-2.5.0.jar:na]
      at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:431) ~[curator-framework-2.5.0.jar:na]
      at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:44) ~[curator-framework-2.5.0.jar:na]
      at org.apache.drill.exec.store.sys.zk.ZkEStore.createNodeInZK(ZkEStore.java:51) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
      ... 11 common frames omitted
      2014-12-02 17:49:32,287 [2b8193d3-f0ca-aa7c-094a-d8234d76d068:frag:0:0] WARN o.a.d.e.p.impl.SendingAccountor - Failure while waiting for send complete.
      java.lang.InterruptedException: null
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1301) ~[na:1.7.0_45]
      at java.util.concurrent.Semaphore.acquire(Semaphore.java:472) ~[na:1.7.0_45]
      at org.apache.drill.exec.physical.impl.SendingAccountor.waitForSendComplete(SendingAccountor.java:44) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
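
One reading of the trace above: `ZkEStore.createNodeInZK` issues a create for `/drill/running/<query id>`, ZooKeeper answers `NodeExists`, and the exception propagates out of `ZkAbstractStore.put` and fails the query. Why the node already exists is not established in this report. The sketch below only illustrates the difference between a strict create and a put that tolerates an existing node. It uses an in-memory map as a stand-in for ZooKeeper; the class `NodeExistsSketch`, its methods, and the state strings are hypothetical and are not Drill's `ZkEStore` or the Curator API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** In-memory stand-in for a ZooKeeper-like store. Hypothetical; not Drill code. */
public class NodeExistsSketch {

    /** Sketch-only exception, named after KeeperException.NodeExistsException. */
    public static class NodeExistsException extends Exception {
        public NodeExistsException(String path) {
            super("NodeExists for " + path);
        }
    }

    private final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

    /** Strict create: fails when the path is already present, as in the trace. */
    public void create(String path, String data) throws NodeExistsException {
        if (nodes.putIfAbsent(path, data) != null) {
            throw new NodeExistsException(path);
        }
    }

    /** Tolerant put: try to create, and overwrite if the node already exists. */
    public void put(String path, String data) {
        try {
            create(path, data);
        } catch (NodeExistsException e) {
            // Assumption: overwriting is acceptable for this kind of entry.
            nodes.put(path, data);
        }
    }

    public String get(String path) {
        return nodes.get(path);
    }

    public static void main(String[] args) {
        NodeExistsSketch store = new NodeExistsSketch();
        // Path taken from the log above; the state strings are illustrative only.
        String path = "/drill/running/2b8193d3-f0ca-aa7c-094a-d8234d76d068";

        store.put(path, "PENDING");
        try {
            store.create(path, "RUNNING"); // second strict create on the same path
            System.out.println("second create unexpectedly succeeded");
        } catch (NodeExistsException e) {
            System.out.println(e.getMessage());
        }

        store.put(path, "RUNNING"); // tolerant put does not throw
        System.out.println(store.get(path));
    }
}
```

Whether overwriting an existing `/drill/running` entry is the right behavior depends on why the node was already there, which would need to be confirmed against the Drill source before applying anything like this.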

          People

            cchang@maprtech.com Chun Chang