[CASSANDRA-16826] Fix flaky test consistent_bootstrap_test.py::TestBootstrapConsistency::test_consistent_reads_after_move - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Resolved
Priority: Normal
Resolution: Fixed
Fix Version/s: 4.1-alpha1, 4.1-beta1, 4.1
Component/s: Test/dtest/python
Labels:
None

Bug Category:
Correctness - Test Failure
Severity:
Low
Complexity:
Low Hanging Fruit
Discovered By:
Unit Test
Platform:

All
Impacts:

None
Since Version:

NA
Source Control Link:

https://github.com/apache/cassandra-dtest/commit/230c66c9395fb339d08744d832279281928d3b9b
Test and Documentation Plan:

Hide

local testing

Show
local testing

Description

consistent_bootstrap_test.py::TestBootstrapConsistency::test_consistent_reads_after_move can be flaky from time to time on less powerful environments due to timeouts; this can be improved by having retries for queries

Here is the error I was seeing

self =  <consistent_bootstrap_test.TestBootstrapConsistency object at 0x7fad988790f0>
     @pytest.mark.no_vnodes
    def test_consistent_reads_after_move(self):
        logger.debug("Creating a ring")
        cluster = self.cluster
        cluster.set_configuration_options(values={'hinted_handoff_enabled': False,
                                                  'write_request_timeout_in_ms': 60000,
                                                  'read_request_timeout_in_ms': 60000,
                                                  'dynamic_snitch_badness_threshold': 0.0})
        cluster.set_batch_commitlog(enabled=True)
   
        cluster.populate(3, tokens=[0, 2**48, 2**62]).start()
        node1, node2, node3 = cluster.nodelist()
   
        logger.debug("Set to talk to node 2")
        n2session = self.patient_cql_connection(node2)
        create_ks(n2session, 'ks', 2)
        create_c1c2_table(self, n2session)
   
        logger.debug("Generating some data for all nodes")
        insert_c1c2(n2session, keys=list(range(10, 20)), consistency=ConsistencyLevel.ALL)
   
        node1.flush()
        logger.debug("Taking down node1")
        node1.stop(wait_other_notice=True)
   
        logger.debug("Writing data to node2")
        insert_c1c2(n2session, keys=list(range(30, 1000)), consistency=ConsistencyLevel.ONE)
        node2.flush()
   
        logger.debug("Restart node1")
        node1.start()
   
        logger.debug("Move token on node3")
        node3.move(2)
   
        logger.debug("Checking that no data was lost")
        for n in range(10, 20):
            query_c1c2(n2session, n, ConsistencyLevel.ALL)
   
        for n in range(30, 1000):
>            query_c1c2(n2session, n, ConsistencyLevel.ALL)
 consistent_bootstrap_test.py:55: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tools/data.py:34: in query_c1c2
    rows = list(session.execute(query))
cassandra/cluster.py:2618: in cassandra.cluster.Session.execute
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
 >    ???
E   cassandra.OperationTimedOut: errors={'127.0.0.1:9042': 'Client request timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.1:9042
 cassandra/cluster.py:4894: OperationTimedOut

Attachments

Issue Links

links to

GH: trunk

Activity

People

Assignee:: David Capwell

Reporter:: David Capwell

Authors:: David Capwell

Reviewers:: Abe Ratnofsky, Ekaterina Dimitrova

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 03/Aug/21 20:23

Updated:: 05/Oct/22 22:22

Resolved:: 24/May/22 14:52