IMPALA-3684

breakpad tests fail when enabled and run in Jenkins

Details

    Description

      When the breakpad tests are enabled in Jenkins, they fail. I can reproduce this with both debug / exhaustive and release / exhaustive builds and test runs.

      http://sandbox.jenkins.cloudera.com/job/impala-umbrella-build-and-test/1403/
      http://sandbox.jenkins.cloudera.com/job/impala-umbrella-build-and-test/1404/

      Because of IMPALA-3630 this doesn't show up in our CI; instead, it only shows up in private builds that have the IMPALA-3630 fix (https://gerrit.cloudera.org/#/c/3307/).

      Console log showing the failures for the debug / exhaustive run:

      custom_cluster/test_alloc_fail.py ..
      custom_cluster/test_breakpad.py FFEFEFEF
      custom_cluster/test_delegation.py ...
      

      ...and for the release / exhaustive run:

      custom_cluster/test_alloc_fail.py ss
      custom_cluster/test_breakpad.py FFFFF
      custom_cluster/test_delegation.py ...
      

      The actual test failures fall into a few categories.

      1. Failed assertions in assert_all_processes_killed() in which Impala still seems to be up.

          def assert_all_processes_killed(self):
            self.cluster.refresh()
      >     assert not self.cluster.impalads
      E     assert not [<tests.common.impala_cluster.ImpaladProcess object at 0x3036590>, <tests.common.impala_cluster.ImpaladProcess object ...n.impala_cluster.ImpaladProcess object at 0x315f710>, <tests.common.impala_cluster.ImpaladProcess object at 0x315f890>]
      E      +  where [<tests.common.impala_cluster.ImpaladProcess object at 0x3036590>, <tests.common.impala_cluster.ImpaladProcess object ...n.impala_cluster.ImpaladProcess object at 0x315f710>, <tests.common.impala_cluster.ImpaladProcess object at 0x315f890>] = <tests.common.impala_cluster.ImpalaCluster object at 0x34b5050>.impalads
      E      +    where <tests.common.impala_cluster.ImpalaCluster object at 0x34b5050> = <test_breakpad.TestBreakpad object at 0x34b50d0>.cluster
      
      custom_cluster/test_breakpad.py:85: AssertionError
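
      For illustration only: the assertion above checks `self.cluster.impalads` once, right after `refresh()`. A polling variant would help tell a slow shutdown apart from a process that never exits. The sketch below is not taken from the Impala test suite; `wait_until` and its parameters are hypothetical names, and nothing here is claimed to be the fix for this issue.

```python
import time


def wait_until(predicate, timeout_s, interval_s=0.1):
    """Poll `predicate` until it returns True or `timeout_s` elapses.

    Hypothetical helper, not part of the Impala test framework.
    Returns True if the predicate held before the deadline, else False.
    """
    deadline = time.time() + timeout_s
    while True:
        if predicate():
            return True
        if time.time() >= deadline:
            return False
        time.sleep(interval_s)


# Hypothetical use inside a test (names from the traceback above):
#
#   def all_impalads_gone():
#       self.cluster.refresh()
#       return not self.cluster.impalads
#
#   assert wait_until(all_impalads_gone, timeout_s=30)
```

      If the polling variant still fails after a generous timeout, the processes are most likely stuck rather than merely slow to exit.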
      

      2. Failures in teardown_method() ultimately linked to some Impala component not being killed within 4 minutes.

      Example:

      ________ ERROR at teardown of TestBreakpad.test_minidump_relative_path _________
      
      self = <test_breakpad.TestBreakpad object at 0x38973d0>
      method = <bound method TestBreakpad.test_minidump_relative_path of <test_breakpad.TestBreakpad object at 0x38973d0>>
      
          def teardown_method(self, method):
            # Override parent
            # Stop the cluster to prevent future accesses to self.tmp_dir.
      >     self._stop_impala_cluster()
      
      custom_cluster/test_breakpad.py:48: 
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
      common/custom_cluster_test_suite.py:103: in _stop_impala_cluster
          check_call([os.path.join(IMPALA_HOME, 'bin/start-impala-cluster.py'), '--kill_only'])
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
      
      popenargs = (['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', '--kill_only'],)
      kwargs = {}, retcode = 1
      cmd = ['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', '--kill_only']
      
          def check_call(*popenargs, **kwargs):
              """Run command with arguments.  Wait for command to complete.  If
              the exit code was zero then return, otherwise raise
              CalledProcessError.  The CalledProcessError object will have the
              return code in the returncode attribute.
          
              The arguments are the same as for the Popen constructor.  Example:
          
              check_call(["ls", "-l"])
              """
              retcode = call(*popenargs, **kwargs)
              cmd = kwargs.get("args")
              if cmd is None:
                  cmd = popenargs[0]
              if retcode:
      >           raise CalledProcessError(retcode, cmd)
      E           CalledProcessError: Command '['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', '--kill_only']' returned non-zero exit status 1
      
      /usr/lib64/python2.6/subprocess.py:505: CalledProcessError
      ----------------------------- Captured stdout call -----------------------------
      Starting State Store logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/statestored.INFO
      Starting Catalog Service logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
      Starting Impala Daemon logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/impalad.INFO
      Starting Impala Daemon logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
      Starting Impala Daemon logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
      Waiting for Catalog... Status: 49 DBs / 1051 tables (ready=True)
      Waiting for Catalog... Status: 49 DBs / 1051 tables (ready=True)
      Waiting for Catalog... Status: 49 DBs / 1051 tables (ready=True)
      Impala Cluster Running with 3 nodes.
      ----------------------------- Captured stderr call -----------------------------
      MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
      MainThread: Waiting for num_known_live_backends=3. Current value: 0
      MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
      MainThread: Waiting for num_known_live_backends=3. Current value: 0
      MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
      MainThread: Waiting for num_known_live_backends=3. Current value: 0
      MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
      MainThread: Waiting for num_known_live_backends=3. Current value: 2
      MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
      MainThread: Waiting for num_known_live_backends=3. Current value: 2
      MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
      MainThread: num_known_live_backends has reached value: 3
      MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25001
      MainThread: num_known_live_backends has reached value: 3
      MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25002
      MainThread: num_known_live_backends has reached value: 3
      MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      MainThread: Getting metric: statestore.live-backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25010
      MainThread: Metric 'statestore.live-backends' has reach desired value: 4
      MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25000
      MainThread: num_known_live_backends has reached value: 3
      MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25001
      MainThread: num_known_live_backends has reached value: 3
      MainThread: Getting num_known_live_backends from impala-boost-static-burst-slave-142d.vpc.cloudera.com:25002
      MainThread: num_known_live_backends has reached value: 3
      MainThread: Attempting to find PID for /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/service/impalad --mem_limit=19206662826 -log_filename=impalad -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 -beeswax_port=21000 -hs2_port=21050 -be_port=22000 -state_store_subscriber_port=23000 -webserver_port=25000 -llama_callback_port=28000
      MainThread: Killing: /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/service/impalad --mem_limit=19206662826 -log_filename=impalad -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 -beeswax_port=21000 -hs2_port=21050 -be_port=22000 -state_store_subscriber_port=23000 -webserver_port=25000 -llama_callback_port=28000 (PID: 32446) with signal 11
      MainThread: Attempting to find PID for /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/service/impalad --mem_limit=19206662826 -log_filename=impalad_node1 -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 -beeswax_port=21001 -hs2_port=21051 -be_port=22001 -state_store_subscriber_port=23001 -webserver_port=25001 -llama_callback_port=28001
      MainThread: Killing: /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/service/impalad --mem_limit=19206662826 -log_filename=impalad_node1 -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 -beeswax_port=21001 -hs2_port=21051 -be_port=22001 -state_store_subscriber_port=23001 -webserver_port=25001 -llama_callback_port=28001 (PID: 32482) with signal 11
      MainThread: Attempting to find PID for /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/service/impalad --mem_limit=19206662826 -log_filename=impalad_node2 -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 -beeswax_port=21002 -hs2_port=21052 -be_port=22002 -state_store_subscriber_port=23002 -webserver_port=25002 -llama_callback_port=28002
      MainThread: Killing: /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/service/impalad --mem_limit=19206662826 -log_filename=impalad_node2 -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 -beeswax_port=21002 -hs2_port=21052 -be_port=22002 -state_store_subscriber_port=23002 -webserver_port=25002 -llama_callback_port=28002 (PID: 32527) with signal 11
      MainThread: Attempting to find PID for /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/statestore/statestored -log_filename=statestored -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0
      MainThread: Killing: /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/statestore/statestored -log_filename=statestored -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 (PID: 32379) with signal 11
      MainThread: Attempting to find PID for /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/catalog/catalogd -log_filename=catalogd -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0
      MainThread: Killing: /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/latest/catalog/catalogd -log_filename=catalogd -log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests -v=1 -logbufsecs=5 -max_log_files=0 (PID: 32386) with signal 11
      --------------------------- Captured stderr teardown ---------------------------
      Traceback (most recent call last):
        File "/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py", line 329, in <module>
          kill_cluster_processes(force=options.force_kill)
        File "/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py", line 134, in kill_cluster_processes
          kill_matching_processes(binaries, force)
        File "/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py", line 154, in kill_matching_processes
          process.pid, KILL_TIMEOUT_IN_SECONDS))
      RuntimeError: Unable to kill impalad (pid 32446) after 240 seconds.
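
      The log above shows each process being sent signal 11, followed by a `RuntimeError` once 240 seconds pass without the process exiting. A minimal, standard-library sketch of that signal-then-wait pattern is given below. It is not the code from `start-impala-cluster.py`; `kill_and_wait` and its parameters are hypothetical, and it assumes a POSIX system and a child process started by the caller.

```python
import os
import signal
import subprocess
import sys
import time


def kill_and_wait(proc, sig, timeout_s, interval_s=0.05):
    """Send `sig` to the child `proc` and wait for it to exit.

    Hypothetical helper for illustration. Returns the child's return code,
    or raises RuntimeError if it is still running after `timeout_s` seconds.
    """
    os.kill(proc.pid, sig)
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        # poll() returns None while the child is still running.
        if proc.poll() is not None:
            return proc.returncode
        time.sleep(interval_s)
    raise RuntimeError(
        "Unable to kill pid %d after %s seconds." % (proc.pid, timeout_s))


if __name__ == "__main__":
    # Start a child that sleeps, then terminate it with SIGTERM.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    print(kill_and_wait(child, signal.SIGTERM, timeout_s=10.0))
```

      A process that handles or ignores the signal and then hangs (for example while writing a dump) would hit the timeout branch, which matches the shape of the error reported here.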
      

      Attachments

        1. backtraces.txt
          53 kB
          Alexander Behm

            People

              lv Lars Volker
              mikeb Michael Brown