Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: Impala 2.6.0
Description
When the breakpad tests are enabled in Jenkins, they fail. I'm able to trigger this on both debug / exhaustive and release / exhaustive builds and tests.
http://sandbox.jenkins.cloudera.com/job/impala-umbrella-build-and-test/1403/
http://sandbox.jenkins.cloudera.com/job/impala-umbrella-build-and-test/1404/
Because of IMPALA-3630, this doesn't show up in our CI; it only shows up in private builds that include the IMPALA-3630 fix (https://gerrit.cloudera.org/#/c/3307/).
Console log showing the failures for the debug / exhaustive run:
custom_cluster/test_alloc_fail.py .. custom_cluster/test_breakpad.py FFEFEFEF custom_cluster/test_delegation.py ...
...and for the release / exhaustive run:
custom_cluster/test_alloc_fail.py ss custom_cluster/test_breakpad.py FFFFF custom_cluster/test_delegation.py ...
The actual test failures fall into a few categories.
1. Failed assertions in assert_all_processes_killed(), where Impala still seems to be up (a sketch of what this check amounts to follows the log excerpts below).
    def assert_all_processes_killed(self):
        self.cluster.refresh()
>       assert not self.cluster.impalads
E       assert not [<tests.common.impala_cluster.ImpaladProcess object at 0x3036590>, <tests.common.impala_cluster.ImpaladProcess object ...n.impala_cluster.ImpaladProcess object at 0x315f710>, <tests.common.impala_cluster.ImpaladProcess object at 0x315f890>]
E        +  where [<tests.common.impala_cluster.ImpaladProcess object at 0x3036590>, <tests.common.impala_cluster.ImpaladProcess object ...n.impala_cluster.ImpaladProcess object at 0x315f710>, <tests.common.impala_cluster.ImpaladProcess object at 0x315f890>] = <tests.common.impala_cluster.ImpalaCluster object at 0x34b5050>.impalads
E        +  where <tests.common.impala_cluster.ImpalaCluster object at 0x34b5050> = <test_breakpad.TestBreakpad object at 0x34b50d0>.cluster

custom_cluster/test_breakpad.py:85: AssertionError
2. Failures in teardown_method(), ultimately caused by some Impala component not being killed within 4 minutes (a sketch of the kill-and-wait path that fails here follows the log excerpts below).
Example:
________ ERROR at teardown of TestBreakpad.test_minidump_relative_path _________

self = <test_breakpad.TestBreakpad object at 0x38973d0>
method = <bound method TestBreakpad.test_minidump_relative_path of <test_breakpad.TestBreakpad object at 0x38973d0>>

    def teardown_method(self, method):
        # Override parent
        # Stop the cluster to prevent future accesses to self.tmp_dir.
>       self._stop_impala_cluster()

custom_cluster/test_breakpad.py:48:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
common/custom_cluster_test_suite.py:103: in _stop_impala_cluster
    check_call([os.path.join(IMPALA_HOME, 'bin/start-impala-cluster.py'), '--kill_only'])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

popenargs = (['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', '--kill_only'],)
kwargs = {}, retcode = 1
cmd = ['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', '--kill_only']

    def check_call(*popenargs, **kwargs):
        """Run command with arguments. Wait for command to complete. If
        the exit code was zero then return, otherwise raise
        CalledProcessError. The CalledProcessError object will have the
        return code in the returncode attribute.

        The arguments are the same as for the Popen constructor. Example:

        check_call(["ls", "-l"])
        """
        retcode = call(*popenargs, **kwargs)
        cmd = kwargs.get("args")
        if cmd is None:
            cmd = popenargs[0]
        if retcode:
>           raise CalledProcessError(retcode, cmd)
E           CalledProcessError: Command '['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', '--kill_only']' returned non-zero exit status 1

/usr/lib64/python2.6/subprocess.py:505: CalledProcessError
----------------------------- Captured stdout call -----------------------------
Starting State Store logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/statestored.INFO
Starting Catalog Service logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
Starting Impala Daemon logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/impalad.INFO
Starting Impala Daemon logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
Starting Impala Daemon logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
Waiting for Catalog... Status: 49 DBs / 1051 tables (ready=True)
Waiting for Catalog... Status: 49 DBs / 1051 tables (ready=True)
Waiting for Catalog... Status: 49 DBs / 1051 tables (ready=True)
Impala Cluster Running with 3 nodes.
----------------------------- Captured stderr call -----------------------------
MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
MainThread: Waiting for num_known_live_backends=3. Current value: 0
MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
MainThread: Waiting for num_known_live_backends=3. Current value: 0
MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
MainThread: Waiting for num_known_live_backends=3. Current value: 0
MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
MainThread: Waiting for num_known_live_backends=3. Current value: 2
MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
MainThread: Waiting for num_known_live_backends=3. Current value: 2
MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
MainThread: num_known_live_backends has reached value: 3
MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25001
MainThread: num_known_live_backends has reached value: 3
MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25002
MainThread: num_known_live_backends has reached value: 3
MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
MainThread: Getting metric: statestore.live-backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25010
MainThread: Metric 'statestore.live-backends' has reach desired value: 4
MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
MainThread: num_known_live_backends has reached value: 3
MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25001
MainThread: num_known_live_backends has reached value: 3
MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25002
MainThread: num_known_live_backends has reached value: 3
MainThread: Attempting to find PID for /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/service/impalad --mem_limit=19206662826 -log_filename=impalad -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 -beeswax_port=21000 -hs2_port=21050 -be_port=22000 -state_store_subscriber_port=23000 -webserver_port=25000 -llama_callback_port=28000
MainThread: Killing: /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/service/impalad --mem_limit=19206662826 -log_filename=impalad -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 -beeswax_port=21000 -hs2_port=21050 -be_port=22000 -state_store_subscriber_port=23000 -webserver_port=25000 -llama_callback_port=28000 (PID: 32446) with signal 11
MainThread: Attempting to find PID for /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/service/impalad --mem_limit=19206662826 -log_filename=impalad_node1 -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 -beeswax_port=21001 -hs2_port=21051 -be_port=22001 -state_store_subscriber_port=23001 -webserver_port=25001 -llama_callback_port=28001
MainThread: Killing: /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/service/impalad --mem_limit=19206662826 -log_filename=impalad_node1 -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 -beeswax_port=21001 -hs2_port=21051 -be_port=22001 -state_store_subscriber_port=23001 -webserver_port=25001 -llama_callback_port=28001 (PID: 32482) with signal 11
MainThread: Attempting to find PID for /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/service/impalad --mem_limit=19206662826 -log_filename=impalad_node2 -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 -beeswax_port=21002 -hs2_port=21052 -be_port=22002 -state_store_subscriber_port=23002 -webserver_port=25002 -llama_callback_port=28002
MainThread: Killing: /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/service/impalad --mem_limit=19206662826 -log_filename=impalad_node2 -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 -beeswax_port=21002 -hs2_port=21052 -be_port=22002 -state_store_subscriber_port=23002 -webserver_port=25002 -llama_callback_port=28002 (PID: 32527) with signal 11
MainThread: Attempting to find PID for /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/statestore/statestored -log_filename=statestored -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0
MainThread: Killing: /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/statestore/statestored -log_filename=statestored -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 (PID: 32379) with signal 11
MainThread: Attempting to find PID for /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/catalog/catalogd -log_filename=catalogd -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0
MainThread: Killing: /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/catalog/catalogd -log_filename=catalogd -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 (PID: 32386) with signal 11
--------------------------- Captured stderr teardown ---------------------------
Traceback (most recent call last):
  File "/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py", line 329, in <module>
    kill_cluster_processes(force=options.force_kill)
  File "/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py", line 134, in kill_cluster_processes
    kill_matching_processes(binaries, force)
  File "/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py", line 154, in kill_matching_processes
    process.pid, KILL_TIMEOUT_IN_SECONDS))
RuntimeError: Unable to kill impalad (pid 32446) after 240 seconds.
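For category 1, the failing assertion checks the cluster state exactly once after the daemons have been sent signal 11. Below is a minimal, purely illustrative sketch of what that check amounts to, assuming only the ImpalaCluster API visible in the traceback (refresh() re-discovers running daemons, impalads lists what was found); the helper name wait_until_all_killed and the polling loop are my own illustration, not the test's actual code.

import time

def wait_until_all_killed(cluster, timeout_s=60):
  # Illustrative only: assert_all_processes_killed() in the traceback above does a
  # single refresh() + "assert not cluster.impalads"; this variant polls, so a
  # daemon that is still shutting down (e.g. writing a minidump) shows up as a
  # timeout rather than an immediate assertion failure.
  deadline = time.time() + timeout_s
  while time.time() < deadline:
    cluster.refresh()            # re-scan for running daemons, per the usage above
    if not cluster.impalads:     # no impalad processes left
      return True
    time.sleep(1)
  return False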
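For category 2, teardown runs bin/start-impala-cluster.py --kill_only via check_call(), and the script's kill_matching_processes() gives each process KILL_TIMEOUT_IN_SECONDS (240 seconds in the traceback) to exit before raising RuntimeError. The sketch below shows that kill-and-wait pattern using only the standard library; the helper name kill_and_wait is hypothetical and the real script's internals may differ. Since breakpad intercepts SIGSEGV (signal 11, as in the captured stderr) to write a minidump before the process exits, that may be why a daemon can outlive the timeout.

import os
import signal
import time

KILL_TIMEOUT_IN_SECONDS = 240  # matches the timeout reported in the traceback

def kill_and_wait(pid, sig=signal.SIGSEGV, timeout_s=KILL_TIMEOUT_IN_SECONDS):
  # Send the signal, then poll until the PID disappears or the timeout expires.
  os.kill(pid, sig)
  deadline = time.time() + timeout_s
  while time.time() < deadline:
    try:
      os.kill(pid, 0)   # signal 0 only probes whether the PID still exists
    except OSError:
      return            # process is gone
    time.sleep(1)
  raise RuntimeError("Unable to kill pid %d after %d seconds" % (pid, timeout_s))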
Attachments
Issue Links
- relates to IMPALA-3693: breakpad tests generate core files that get wrongly earmarked for collection (Resolved)