  1. Cassandra
  2. CASSANDRA-10822

SSTable data loss when upgrading with row tombstone present

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Urgent
    • Resolution: Fixed
    • Fix Version/s: 3.0.2, 3.1.1, 3.2
    • Component/s: None
    • Labels: None
    • Severity: Critical

    Description

      I ran into an issue when upgrading from 2.1.11 to 3.0.0 (and also to the cassandra-3.0 branch): within a partition that contains a row tombstone, all rows following the tombstone were lost.

      Here's a scenario that reproduces the issue.

      Using ccm, create a single-node cluster at 2.1.11:

      ccm create -n 1 -v 2.1.11 -s financial

      Run the following queries to create the schema, populate some data, and then delete the data for November:

      drop keyspace if exists financial;
      
      create keyspace if not exists financial with replication = {'class': 'SimpleStrategy', 'replication_factor' : 1 };
      
      create table if not exists financial.symbol_history (
        symbol text,
        name text static,
        year int,
        month int,
        day int,
        volume bigint,
        close double,
        open double,
        low double,
        high double,
        primary key((symbol, year), month, day)
      ) with CLUSTERING ORDER BY (month desc, day desc);
      
      insert into financial.symbol_history (symbol, name, year, month, day, volume) values ('CORP', 'MegaCorp', 2004, 1, 1, 100);
      insert into financial.symbol_history (symbol, name, year, month, day, volume) values ('CORP', 'MegaCorp', 2004, 2, 1, 100);
      insert into financial.symbol_history (symbol, name, year, month, day, volume) values ('CORP', 'MegaCorp', 2004, 3, 1, 100);
      insert into financial.symbol_history (symbol, name, year, month, day, volume) values ('CORP', 'MegaCorp', 2004, 4, 1, 100);
      insert into financial.symbol_history (symbol, name, year, month, day, volume) values ('CORP', 'MegaCorp', 2004, 5, 1, 100);
      insert into financial.symbol_history (symbol, name, year, month, day, volume) values ('CORP', 'MegaCorp', 2004, 6, 1, 100);
      insert into financial.symbol_history (symbol, name, year, month, day, volume) values ('CORP', 'MegaCorp', 2004, 7, 1, 100);
      insert into financial.symbol_history (symbol, name, year, month, day, volume) values ('CORP', 'MegaCorp', 2004, 8, 1, 100);
      insert into financial.symbol_history (symbol, name, year, month, day, volume) values ('CORP', 'MegaCorp', 2004, 9, 1, 100);
      insert into financial.symbol_history (symbol, name, year, month, day, volume) values ('CORP', 'MegaCorp', 2004, 10, 1, 100);
      insert into financial.symbol_history (symbol, name, year, month, day, volume) values ('CORP', 'MegaCorp', 2004, 11, 1, 100);
      insert into financial.symbol_history (symbol, name, year, month, day, volume) values ('CORP', 'MegaCorp', 2004, 12, 1, 100);
      
      delete from financial.symbol_history where symbol='CORP' and year = 2004 and month=11;
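
The twelve inserts differ only in the month, so the data-loading part of the reproduction can be generated rather than typed. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and the statement text is copied from the queries above.

```python
# Sketch: build the reproduction statements from the ticket programmatically.
# Function names are hypothetical; the CQL text mirrors the ticket's queries.
TABLE = "financial.symbol_history"


def insert_statements(symbol="CORP", name="MegaCorp", year=2004, volume=100):
    """Return one INSERT per month (always day 1), as in the ticket."""
    template = (
        "insert into {table} (symbol, name, year, month, day, volume) "
        "values ('{symbol}', '{name}', {year}, {month}, 1, {volume});"
    )
    return [
        template.format(table=TABLE, symbol=symbol, name=name,
                        year=year, month=month, volume=volume)
        for month in range(1, 13)
    ]


def delete_month_statement(month, symbol="CORP", year=2004):
    """Return the DELETE that creates a row-range tombstone for one month."""
    return (
        "delete from {table} where symbol='{symbol}' "
        "and year = {year} and month={month};"
    ).format(table=TABLE, symbol=symbol, year=year, month=month)


if __name__ == "__main__":
    for statement in insert_statements():
        print(statement)
    print(delete_month_statement(11))
```

The printed statements can be pasted into cqlsh after the schema has been created.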
      

      Flush and run sstable2json on the sole Data.db file:

      ccm node1 flush
      sstable2json /path/to/file.db
      

      The output should look like the following:

      [
      {"key": "CORP:2004",
       "cells": [["::name","MegaCorp",1449457517033030],
                 ["12:1:","",1449457517033030],
                 ["12:1:volume","100",1449457517033030],
                 ["11:_","11:!",1449457564983269,"t",1449457564],
                 ["10:1:","",1449457516313738],
                 ["10:1:volume","100",1449457516313738],
                 ["9:1:","",1449457516310205],
                 ["9:1:volume","100",1449457516310205],
                 ["8:1:","",1449457516235664],
                 ["8:1:volume","100",1449457516235664],
                 ["7:1:","",1449457516233535],
                 ["7:1:volume","100",1449457516233535],
                 ["6:1:","",1449457516231458],
                 ["6:1:volume","100",1449457516231458],
                 ["5:1:","",1449457516228307],
                 ["5:1:volume","100",1449457516228307],
                 ["4:1:","",1449457516225415],
                 ["4:1:volume","100",1449457516225415],
                 ["3:1:","",1449457516222811],
                 ["3:1:volume","100",1449457516222811],
                 ["2:1:","",1449457516220301],
                 ["2:1:volume","100",1449457516220301],
                 ["1:1:","",1449457516210758],
                 ["1:1:volume","100",1449457516210758]]}
      ]
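
To make explicit what a correct 3.0 read should return, the 2.1 dump can be summarized into live rows and tombstoned months. This is a rough sketch under assumptions inferred from the output above: a cell whose name is `month:day:` (empty column part) is treated as a row marker, and a cell with `"t"` in the fourth position is treated as a range tombstone. The sample is abbreviated to the first few cells of the dump.

```python
import json

# Abbreviated copy of the sstable2json output shown in the ticket.
SAMPLE = """
[
{"key": "CORP:2004",
 "cells": [["::name","MegaCorp",1449457517033030],
           ["12:1:","",1449457517033030],
           ["12:1:volume","100",1449457517033030],
           ["11:_","11:!",1449457564983269,"t",1449457564],
           ["10:1:","",1449457516313738],
           ["10:1:volume","100",1449457516313738]]}
]
"""


def summarize(partition):
    """Return (rows, tombstoned_months) for one partition of the dump.

    Assumed layout (inferred from the ticket, not from Cassandra's source):
    - ["<month>:<day>:", "", ts]                 -> row marker
    - ["<month>:_", "<month>:!", ts, "t", ldt]   -> range tombstone
    """
    rows, tombstoned = [], []
    for cell in partition["cells"]:
        if len(cell) >= 4 and cell[3] == "t":
            tombstoned.append(int(cell[0].split(":")[0]))
            continue
        parts = cell[0].split(":")
        # Row markers have a non-empty month and an empty column name.
        if len(parts) == 3 and parts[0] and parts[2] == "":
            rows.append((int(parts[0]), int(parts[1])))
    return rows, tombstoned


if __name__ == "__main__":
    for partition in json.loads(SAMPLE):
        rows, tombstoned = summarize(partition)
        print(partition["key"], "rows:", rows, "tombstoned months:", tombstoned)
```

Run against the full dump, this would list 11 row markers (every month except 11), which is the row count a query should return after the upgrade.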
      

      Prepare for the upgrade:

      ccm node1 nodetool snapshot financial
      ccm node1 nodetool drain
      ccm node1 stop
      

      Upgrade to cassandra-3.0 and start the node:

      ccm node1 setdir -v git:cassandra-3.0
      ccm node1 start
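
For repeated runs, the snapshot, drain, stop, setdir, and start steps can be collected into one helper. This is only a sketch: the function names are hypothetical, the ccm invocations are exactly the ones listed above, and nothing is executed unless `dry_run=False` is passed.

```python
import shlex
import subprocess


def upgrade_commands(node="node1", keyspace="financial",
                     version="git:cassandra-3.0"):
    """Return the ccm commands from the ticket, in order, as argument lists."""
    return [
        ["ccm", node, "nodetool", "snapshot", keyspace],
        ["ccm", node, "nodetool", "drain"],
        ["ccm", node, "stop"],
        ["ccm", node, "setdir", "-v", version],
        ["ccm", node, "start"],
    ]


def run_upgrade(dry_run=True, **kwargs):
    """Print each command; execute it only when dry_run is False."""
    for cmd in upgrade_commands(**kwargs):
        print(" ".join(shlex.quote(arg) for arg in cmd))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_upgrade()  # dry run: prints the commands without running them
```

Stopping at the first failure matters here, since starting the new version on a node that was not drained would change the scenario being reproduced.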
      

      Run the following query in cqlsh and observe that only 1 row is returned! It appears that all data following November is gone.

      cqlsh> select * from financial.symbol_history;
      
       symbol | year | month | day | name     | close | high | low  | open | volume
      --------+------+-------+-----+----------+-------+------+------+------+--------
         CORP | 2004 |    12 |   1 | MegaCorp |  null | null | null | null |    100
      

      Upgrade the sstables and query again; you'll observe the same problem.

      ccm node1 nodetool upgradesstables financial
      

      I modified the 2.2 version of sstable2json so that it works with 3.0 (couldn't help myself), and observed 2 RangeTombstoneBoundMarker occurrences for the 1 delete, with the rest of the data missing.

      [
      {
       "key": "CORP:2004",
       "static": {
        "cells": {
          ["name","MegaCorp",1449457517033030]
        }
       },
       "rows": [
        {
         "clustering": {"month": "12", "day": "1"},
         "cells": {
           ["volume","100",1449457517033030]
         }
        },
        {
         "tombstone": ["11:*",1449457564983269,"t",1449457564]
        },
        {
         "tombstone": ["11:*",1449457564983269,"t",1449457564]
        }
       ]
      }
      ]
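
For context, the 3.0 storage engine represents a range deletion as an opening and a closing bound marker, so two markers for one delete may simply be that pair; the dump above does not say which is which, so that reading is an assumption. The toy model below (plain Python classes, not Cassandra's) walks a stream of rows and markers and shows that a well-formed marker pair does not by itself explain the missing rows.

```python
from dataclasses import dataclass


@dataclass
class Row:
    month: int
    day: int


@dataclass
class Marker:
    month: int
    is_open: bool  # True opens a deletion range, False closes it


def check_stream(items):
    """Return (surviving_rows, problems) for a stream of Row/Marker items.

    Toy model: timestamps are ignored, and any row between an open and a
    close marker is considered deleted.
    """
    rows, problems = [], []
    open_marker = None
    for item in items:
        if isinstance(item, Marker):
            if item.is_open:
                if open_marker is not None:
                    problems.append("open marker while a range is already open")
                open_marker = item
            else:
                if open_marker is None:
                    problems.append("close marker without a matching open")
                open_marker = None
        elif open_marker is None:
            rows.append((item.month, item.day))
    if open_marker is not None:
        problems.append("range still open at end of partition")
    return rows, problems


# What the 2.1 data implies (abbreviated) versus what the 3.0 dump shows.
EXPECTED = [Row(12, 1), Marker(11, True), Marker(11, False), Row(10, 1), Row(9, 1)]
OBSERVED = [Row(12, 1), Marker(11, True), Marker(11, False)]

if __name__ == "__main__":
    print("expected:", check_stream(EXPECTED))
    print("observed:", check_stream(OBSERVED))
```

In this model both streams are well-formed; they differ only in that the rows after the closing marker are absent from the observed one.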
      

      I'm not sure why this is happening, but I should point out that I'm using static columns here and reverse order for my clustering, so maybe that matters. I'll try without static columns and with regular ordering to see if that makes a difference, and update the ticket.
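
One way to separate the two suspects is to repeat the reproduction on all four combinations of static column and clustering order. The sketch below generates the table definitions; the suffixed table names (for example `symbol_history_static_desc`) are hypothetical and exist only to keep the variants apart.

```python
from itertools import product


def create_table_statement(use_static, reverse_order):
    """Return a CREATE TABLE variant of the ticket's schema.

    use_static:    keep `name` as a static column, or make it a regular one
    reverse_order: cluster by (month desc, day desc) or (month asc, day asc)
    """
    suffix = "{}_{}".format("static" if use_static else "nostatic",
                            "desc" if reverse_order else "asc")
    name_col = "name text static" if use_static else "name text"
    order = "desc" if reverse_order else "asc"
    return (
        "create table if not exists financial.symbol_history_{suffix} (\n"
        "  symbol text,\n"
        "  {name_col},\n"
        "  year int,\n"
        "  month int,\n"
        "  day int,\n"
        "  volume bigint,\n"
        "  close double,\n"
        "  open double,\n"
        "  low double,\n"
        "  high double,\n"
        "  primary key((symbol, year), month, day)\n"
        ") with CLUSTERING ORDER BY (month {order}, day {order});"
    ).format(suffix=suffix, name_col=name_col, order=order)


def all_variants():
    """Map (use_static, reverse_order) to its CREATE TABLE statement."""
    return {
        (static, reverse): create_table_statement(static, reverse)
        for static, reverse in product([True, False], repeat=2)
    }


if __name__ == "__main__":
    for statement in all_variants().values():
        print(statement, end="\n\n")
```

The `(True, True)` variant corresponds to the schema used in this report; if only that one loses rows after the upgrade, both factors are involved.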

          People

            Assignee: Branimir Lambov (blambov)
            Reporter: Andy Tolbert (andrew.tolbert)
            Authors: Branimir Lambov
            Reviewers: Sylvain Lebresne