Cassandra / CASSANDRA-15072

Incomplete range results during 2.X -> 3.11.4 upgrade

Details

    • Correctness - Transient Incorrect Response
    • Normal
    • Normal
    • User Report
    • circleci / in jvm upgrade dtests

    Description

      Hello

      During an upgrade from 2.1.17 to 3.11.4, our application started getting back incomplete results for range queries. Once all nodes were upgraded (before upgrading sstables), we stopped getting incomplete results. I was able to reproduce it and have listed the steps below. It seems to require the random partitioner and compact storage to reproduce reliably. It also reproduces when upgrading from 2.1.21 and 2.2.14. The bad behavior seems to occur when an old node is the coordinator and it has to talk to an upgraded replica.

      ccm create test -v 2.1.17 -n 3
      ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
      ccm node1 updateconf 'initial_token: 0'
      ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
      ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
      ccm start
      
      ccm node1 cqlsh <<SCHEMA
      CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};
      CREATE COLUMNFAMILY test.test (
        id text,
        foo text,
        bar text,
        PRIMARY KEY (id)
      ) WITH COMPACT STORAGE;
      CONSISTENCY QUORUM;
      INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
      INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
      SCHEMA
      
      ccm node1 stop
      ccm node1 setdir -v 3.11.4
      ccm node1 start
      
      ccm node2 stop
      ccm node2 setdir -v 3.11.4
      ccm node2 start
      
      # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
      # allow for simpler test setup)
      cqlsh 127.0.0.3 <<QUERY
      CONSISTENCY QUORUM;
      PAGING 2;
      select * from test.test;
      QUERY
      

      Against the 2.X coordinator (127.0.0.3), this returns only one of the two rows:

      Page size: 2
      
       id | bar   | foo
      ----+-------+-----
        2 | there |  hi
      
      (1 rows)
      

      Running the same query against the upgraded node (node1) returns both rows:

      Page size: 2
      
       id | bar   | foo
      ----+-------+-----
        2 | there |  hi
        1 | there |  hi
      
      (2 rows)
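
The coordinator comparison above can be scripted so the mismatch is easy to spot while nodes are being upgraded. This is only a rough sketch, not part of the original report: the function names `row_count`, `query_rows` and `check_coordinators` are made up for illustration, and it assumes `cqlsh` is on the PATH and that the ccm nodes listen on 127.0.0.1 through 127.0.0.3 (the report itself only shows 127.0.0.3). It sends the same query as the repro through each coordinator and compares the `(N rows)` footer with the expected count.

```shell
# Hypothetical helpers; names and addresses are assumptions, not from the report.

# Read cqlsh output on stdin and print N from the last "(N rows)" footer.
row_count() {
  sed -n 's/^[[:space:]]*(\([0-9][0-9]*\) rows)[[:space:]]*$/\1/p' | tail -n 1
}

# Run the repro's range query through one coordinator ($1) and print the row count.
query_rows() {
  cqlsh "$1" <<QUERY | row_count
CONSISTENCY QUORUM;
PAGING 2;
select * from test.test;
QUERY
}

# Usage: check_coordinators EXPECTED_ROWS HOST...
# Prints one line per coordinator and returns non-zero if any count differs.
check_coordinators() {
  expected=$1
  shift
  status=0
  for host in "$@"; do
    got=$(query_rows "$host")
    if [ "$got" = "$expected" ]; then
      echo "$host: ok ($got rows)"
    else
      echo "$host: expected $expected rows, got ${got:-none}"
      status=1
    fi
  done
  return "$status"
}
```

With the two rows inserted by the repro, something like `check_coordinators 2 127.0.0.1 127.0.0.3` would be expected to flag the 2.X coordinator while the cluster is in the mixed-version state described above.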
      

      Attachments

        1. eriksw-repro.sh
          2 kB
          Erik Swanson

          People

            bdeggleston Blake Eggleston
            muir Muir Manders
            Blake Eggleston
            Sam Tunnicliffe