Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Duplicate
- Version: Impala 2.5.0
Description
The stress test often prints many instances of the following error:
11:34:00 Process Process-84:
11:34:00 Traceback (most recent call last):
11:34:00   File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
11:34:00     self.run()
11:34:00   File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
11:34:00     self._target(*self._args, **self._kwargs)
11:34:00   File "tests/stress/concurrent_select.py", line 613, in _start_single_runner
11:34:00     raise Exception("Query failed: %s" % str(report.non_mem_limit_error))
11:34:00 Exception: Query failed:
11:34:00 Couldn't get a client for impala-stress-cdh5-trunk2-5.vpc.cloudera.com:22000 Reason: Couldn't open transport for impala-stress-cdh5-trunk2-5.vpc.cloudera.com:22000 (connect() failed: Connection timed out)
For example, see the console output at http://sandbox.jenkins.cloudera.com/job/Impala-Stress-Test-EC2-CDH5-trunk/621/console
Usually this error fails the job, but occasionally the job recovers and keeps going (although the error may show up again).
It's hard to catch the cluster at the exact moment the error occurs, but I've seen 40+ queries running on the impalads after it occurs.
We need to investigate exactly what is causing this, and then decide what to do about it. This is currently failing a large proportion of stress jobs.
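As a starting point for that investigation, one option is to probe the impalad port named in the error from the stress test host while a job is running. The sketch below is not part of the stress test: `probe_port` and `probe_all` are hypothetical helpers, the 5 second timeout is an arbitrary assumption, and the only values taken from this report are the host name and port 22000 in the usage example. It only checks whether a plain TCP connect succeeds, so it can help distinguish a network-level timeout from a client-side failure but says nothing about why the impalad is not accepting connections.

```python
import socket
import time


def probe_port(host, port, timeout_s=5.0):
    """Attempt one TCP connect. Returns (ok, elapsed_seconds, error_text)."""
    start = time.time()
    try:
        sock = socket.create_connection((host, port), timeout=timeout_s)
    except OSError as e:
        # Covers timeouts, refused connections, and name resolution failures.
        return False, time.time() - start, str(e)
    sock.close()
    return True, time.time() - start, ""


def probe_all(hosts, port=22000, timeout_s=5.0):
    """Probe the same port on every host; returns {host: (ok, elapsed, error)}."""
    return {host: probe_port(host, port, timeout_s) for host in hosts}


if __name__ == "__main__":
    # Host and port copied from the error message in this report.
    hosts = ["impala-stress-cdh5-trunk2-5.vpc.cloudera.com"]
    for host, (ok, elapsed, err) in probe_all(hosts).items():
        status = "ok" if ok else "FAILED: %s" % err
        print("%s:22000 %s (%.2fs)" % (host, status, elapsed))
```

Running this in a loop alongside a stress job, and logging the timestamps of failed probes, may make it easier to correlate the connection timeouts with the number of queries running on the impalads at that moment.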
Issue Links
- blocks: IMPALA-3186 Stress test crash: wait until impalad is quiesced before starting next test: [libstdc++.so.6+0x5cad0] __dynamic_cast+0x50 (Resolved)