Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-17685

Test failure: transient_replication_test.py::TestTransientReplicationRepairStreamEntireSSTable::test_optimized_primary_range_repair

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Triage Needed
    • Normal
    • Resolution: Unresolved
    • 4.1.x
    • Test/dtest/python
    • None
    • Correctness - Test Failure
    • All
    • None

    Description

      The Python dtest transient_replication_test.py::TestTransientReplicationRepairStreamEntireSSTable::test_optimized_primary_range_repair is flaky at least in cassandra-4.1, with a flakiness < 1%.

      I haven't seen the failure on Jenkins but on this CircleCI run for CASSANDRA-17458.

      The failure can also be reproduced in the multiplexer, with 5 failures in 5000 iterations:

      self = <transient_replication_test.TestTransientReplicationRepairStreamEntireSSTable object at 0x7f87951c77b8>
      
          @pytest.mark.no_vnodes
          def test_optimized_primary_range_repair(self):
              """ optimized primary range incremental repair from full replica should remove data on node3 """
              self._test_speculative_write_repair_cycle(primary_range=True,
                                                        optimized_repair=True,
                                                        repair_coordinator=self.node1,
      >                                                 expect_node3_data=False)
      
      transient_replication_test.py:523: 
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
      transient_replication_test.py:473: in _test_speculative_write_repair_cycle
          with tm(self.node1) as tm1, tm(self.node2) as tm2, tm(self.node3) as tm3:
      transient_replication_test.py:62: in __enter__
          self.start()
      transient_replication_test.py:55: in start
          self.jmx.start()
      tools/jmxutils.py:187: in start
          subprocess.check_output(args, stderr=subprocess.STDOUT)
      /usr/lib/python3.6/subprocess.py:356: in check_output
          **kwargs).stdout
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
      
      input = None, timeout = None, check = True
      popenargs = (('/usr/lib/jvm/java-8-openjdk-amd64/bin/java', '-cp', '/usr/lib/jvm/java-8-openjdk-amd64/lib/tools.jar:/home/cassandr...t/tools/../lib/jolokia-jvm-1.6.2-agent.jar', 'org.jolokia.jvmagent.client.AgentLauncher', '--host', '127.0.0.1', ...),)
      kwargs = {'stderr': -2, 'stdout': -1}
      process = <subprocess.Popen object at 0x7f877e977208>
      stdout = b"Couldn't start agent for PID 11637\nPossible reason could be that port '8778' is already occupied.\nPlease check the standard output of the target process for a detailed error message.\n"
      stderr = None, retcode = 1
      
          def run(*popenargs, input=None, timeout=None, check=False, **kwargs):
              """Run command with arguments and return a CompletedProcess instance.
          
              The returned instance will have attributes args, returncode, stdout and
              stderr. By default, stdout and stderr are not captured, and those attributes
              will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.
          
              If check is True and the exit code was non-zero, it raises a
              CalledProcessError. The CalledProcessError object will have the return code
              in the returncode attribute, and output & stderr attributes if those streams
              were captured.
          
              If timeout is given, and the process takes too long, a TimeoutExpired
              exception will be raised.
          
              There is an optional argument "input", allowing you to
              pass a string to the subprocess's stdin.  If you use this argument
              you may not also use the Popen constructor's "stdin" argument, as
              it will be used internally.
          
              The other arguments are the same as for the Popen constructor.
          
              If universal_newlines=True is passed, the "input" argument must be a
              string and stdout/stderr in the returned object will be strings rather than
              bytes.
              """
              if input is not None:
                  if 'stdin' in kwargs:
                      raise ValueError('stdin and input arguments may not both be used.')
                  kwargs['stdin'] = PIPE
          
              with Popen(*popenargs, **kwargs) as process:
                  try:
                      stdout, stderr = process.communicate(input, timeout=timeout)
                  except TimeoutExpired:
                      process.kill()
                      stdout, stderr = process.communicate()
                      raise TimeoutExpired(process.args, timeout, output=stdout,
                                           stderr=stderr)
                  except:
                      process.kill()
                      process.wait()
                      raise
                  retcode = process.poll()
                  if check and retcode:
                      raise CalledProcessError(retcode, process.args,
      >                                        output=stdout, stderr=stderr)
      E               subprocess.CalledProcessError: Command '('/usr/lib/jvm/java-8-openjdk-amd64/bin/java', '-cp', '/usr/lib/jvm/java-8-openjdk-amd64/lib/tools.jar:/home/cassandra/cassandra-dtest/tools/../lib/jolokia-jvm-1.6.2-agent.jar', 'org.jolokia.jvmagent.client.AgentLauncher', '--host', '127.0.0.1', 'start', '11637')' returned non-zero exit status 1.
      
      /usr/lib/python3.6/subprocess.py:438: CalledProcessError
      

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              adelapena Andres de la Peña
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated: