Uploaded image for project: 'Apache Cassandra'
  1. Apache Cassandra
  2. CASSANDRA-17806

Flaky test_rolling_upgrade

    XMLWordPrintableJSON

Details

    Description

      The fix on CASSANDRA-17140 needs to be extended into other places as it seems it now fails only one in a billion but still we can fix that one.

      Regression
      
      dtest-upgrade.upgrade_tests.upgrade_through_versions_test.TestProtoV3Upgrade_AllVersions_RandomPartitioner_EndsAt_3_11_X_HEAD.test_rolling_upgrade (from Cassandra dtests)
      Failing for the past 1 build (Since
      #115 )
      Took 10 min.
      Failed 1 times in the last 9 runs. Flakiness: 12%, Stability: 88%
      Error Message
      
      RuntimeError: A subprocess has terminated early. Subprocess statuses: Process-1 (is_alive: True), Process-2 (is_alive: False), attempting to terminate remaining subprocesses now.
      
      Stacktrace
      
      self = <upgrade_tests.upgrade_through_versions_test.TestProtoV3Upgrade_AllVersions_RandomPartitioner_EndsAt_3_11_X_HEAD object at 0x7f4d313e4e50>
      
          @pytest.mark.timeout(3000)
          def test_rolling_upgrade(self):
              """
                  Test rolling upgrade of the cluster, so we have mixed versions part way through.
                  """
      >       self.upgrade_scenario(rolling=True)
      
      upgrade_tests/upgrade_through_versions_test.py:340: 
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
      upgrade_tests/upgrade_through_versions_test.py:417: in upgrade_scenario
          self._check_on_subprocs(self.fixture_dtest_setup.subprocs)
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
      
      self = <upgrade_tests.upgrade_through_versions_test.TestProtoV3Upgrade_AllVersions_RandomPartitioner_EndsAt_3_11_X_HEAD object at 0x7f4d313e4e50>
      subprocs = [<Process name='Process-1' pid=10867 parent=389 stopped exitcode=-SIGKILL daemon>, <Process name='Process-2' pid=10881 parent=389 stopped exitcode=1 daemon>]
      
          def _check_on_subprocs(self, subprocs):
              """
                  Check on given subprocesses.
          
                  If any are not alive, we'll go ahead and terminate any remaining alive subprocesses since this test is going to fail.
                  """
              subproc_statuses = [s.is_alive() for s in subprocs]
              if not all(subproc_statuses):
                  message = "A subprocess has terminated early. Subprocess statuses: "
                  for s in subprocs:
                      message += "{name} (is_alive: {aliveness}), ".format(name=s.name, aliveness=s.is_alive())
                  message += "attempting to terminate remaining subprocesses now."
                  self._terminate_subprocs()
      >           raise RuntimeError(message)
      E           RuntimeError: A subprocess has terminated early. Subprocess statuses: Process-1 (is_alive: True), Process-2 (is_alive: False), attempting to terminate remaining subprocesses now.
      
      upgrade_tests/upgrade_through_versions_test.py:475: RuntimeError
      

      Attachments

        Issue Links

          Activity

            People

              bereng Berenguer Blasi
              bereng Berenguer Blasi
              Berenguer Blasi
              Benjamin Lerer
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: