Details
Improvement
Status: Resolved
Major
Resolution: Cannot Reproduce
Impala 2.7.0
None
Description
Running the stress test against Kudu results in a large number of queries timing out with the following error:
I0831 22:57:36.206172 24679 jni-util.cc:169] java.lang.RuntimeException: Loading Kudu Table failed
    at com.cloudera.impala.planner.KuduScanNode.computeScanRangeLocations(KuduScanNode.java:155)
    at com.cloudera.impala.planner.KuduScanNode.init(KuduScanNode.java:104)
    at com.cloudera.impala.planner.SingleNodePlanner.createScanNode(SingleNodePlanner.java:1257)
    at com.cloudera.impala.planner.SingleNodePlanner.createTableRefNode(SingleNodePlanner.java:1470)
    at com.cloudera.impala.planner.SingleNodePlanner.createTableRefsPlan(SingleNodePlanner.java:745)
    at com.cloudera.impala.planner.SingleNodePlanner.createSelectPlan(SingleNodePlanner.java:585)
    at com.cloudera.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:236)
    at com.cloudera.impala.planner.SingleNodePlanner.createSingleNodePlan(SingleNodePlanner.java:144)
    at com.cloudera.impala.planner.Planner.createPlan(Planner.java:62)
    at com.cloudera.impala.service.Frontend.createExecRequest(Frontend.java:975)
    at com.cloudera.impala.service.JniFrontend.createExecRequest(JniFrontend.java:150)
Caused by: com.stumbleupon.async.TimeoutException: Timed out after 10000ms when joining Deferred@1666682109(state=PENDING, result=null, callback=get tablet locations from the master for table Kudu Master -> release master lookup permit -> retry RPC -> org.kududb.client.AsyncKuduClient$4@8e968ff -> wakeup thread Thread-55, errback=passthrough -> release master lookup permit -> retry RPC after error -> passthrough -> wakeup thread Thread-55)
    at com.stumbleupon.async.Deferred.doJoin(Deferred.java:1161)
    at com.stumbleupon.async.Deferred.join(Deferred.java:1029)
    at org.kududb.client.KuduClient.openTable(KuduClient.java:181)
    at com.cloudera.impala.planner.KuduScanNode.computeScanRangeLocations(KuduScanNode.java:119)
    ... 10 more
Another log entry that seems relevant reports a very long thread creation time:
W0831 22:52:59.616951 8714 thread.cc:502] negotiator [worker] (thread pool) Time spent creating pthread: real 37.363s user 0.000s sys 0.000s
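To check how often thread creation stalls like this during a run, the logs can be scanned for the same warning. The following is a minimal sketch, not part of the stress test: the function name and the 1-second default threshold are assumptions chosen for illustration, and the pattern only matches the message format shown in the entry above.

```python
import re

# Matches the warning format shown in the log entry above, e.g.
# "... Time spent creating pthread: real 37.363s user 0.000s sys 0.000s"
PTHREAD_RE = re.compile(
    r"Time spent creating pthread: real (?P<real>\d+(?:\.\d+)?)s"
)


def slow_thread_creations(lines, threshold_s=1.0):
    """Yield (seconds, line) for thread-creation warnings at or above threshold_s.

    threshold_s is a hypothetical cutoff picked for this sketch, not a value
    taken from Impala or Kudu.
    """
    for line in lines:
        match = PTHREAD_RE.search(line)
        if match is None:
            continue  # not a thread-creation timing warning
        seconds = float(match.group("real"))
        if seconds >= threshold_s:
            yield seconds, line.rstrip("\n")
```

Passing an open log file (or any iterable of lines) to slow_thread_creations yields each matching warning together with its wall-clock creation time, which makes it easier to line up the stalls with the timestamps of the 10000ms openTable timeouts.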
The stress test was run on an 8-node EC2 cluster with CDH 5.9.1 installed. The latest Kudu was installed from packages. The OS version is Ubuntu 14.04. The stress test runs TPC-H queries at scale factor 10.
Filing this JIRA against Impala for now, until the necessary changes to the stress test generator are checked in.