Details

      Description

      In JBOD, when someone gets a bad drive, the bad drive is replaced with a new empty one and repair is run.
      This can cause deleted data to come back to life in some cases. The same is true for corrupt sstables, where we delete the corrupt sstable and run repair.
      Here is an example:
      Say we have 3 nodes A, B and C, with RF=3 and gc_grace=10 days.
      row=sankalp col=sankalp was written 20 days ago and successfully went to all three nodes.
      Then a delete/tombstone was written successfully for the same row column 15 days ago.
      Since this tombstone is older than gc_grace, it was purged on nodes A and B, because it was compacted together with the actual data. So there is no trace of this row column on nodes A and B.
      Now on node C, say the original data is on drive1 and the tombstone is on drive2. Compaction has not yet reclaimed the data and tombstone.
      Drive2 becomes corrupt and is replaced with a new empty drive.
      Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp has come back to life.
      Now, after replacing the drive, we run repair, and this data is propagated to all nodes.

      Note: This is still a problem even if we run repair every gc_grace.
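
      The scenario above, condensed into a toy, self-contained Java sketch (purely illustrative; none of this is Cassandra code, and the cell names just mirror the example):

{code:java}
import java.util.HashSet;
import java.util.Set;

public class ResurrectionDemo {
    public static void main(String[] args) {
        long gcGraceDays = 10;
        long tombstoneAgeDays = 15;

        // Nodes A and B: data and tombstone were compacted together, and the
        // tombstone is older than gc_grace, so both were purged entirely.
        boolean purgedOnAandB = tombstoneAgeDays > gcGraceDays;

        // Node C: data on drive1, tombstone on drive2; drive2 is swapped for an empty drive.
        Set<String> drive1 = new HashSet<>(Set.of("sankalp:sankalp=data"));
        Set<String> drive2 = new HashSet<>(Set.of("sankalp:sankalp=tombstone"));
        drive2.clear(); // new empty drive replaces the bad one

        // Repair now sees live data on C and nothing on A/B, so the deleted cell spreads back.
        boolean resurrected = purgedOnAandB
                && drive1.contains("sankalp:sankalp=data")
                && !drive2.contains("sankalp:sankalp=tombstone");
        System.out.println("deleted data comes back to life: " + resurrected); // prints true
    }
}
{code}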

        Issue Links

          Activity

          kohlisankalp sankalp kohli added a comment -

          With this, the whole disk_failure_policy stuff is broken. If you blacklist a drive, you can potentially bring data back to life.

           One of the fixes for this is one of my JIRAs from long back:
          CASSANDRA-4784
          If we divide each drive with ranges, then we are sure that the data along with the tombstone will get blacklisted.
          Example: Say a node is handling range 1-10 and 11-20. We can have drive A handle 1-10 and drive B handle 11-20.
           Though this might have problems with load balancing.
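
           A hypothetical illustration of that idea (this is not the CASSANDRA-4784 code; names and ranges are made up): pinning each token range to a single drive means a cell and its tombstone always live, and get lost, together.

{code:java}
import java.util.NavigableMap;
import java.util.TreeMap;

class RangePerDrive {
    // drive owning the range that starts at a given token
    private final NavigableMap<Long, String> driveByRangeStart = new TreeMap<>();

    void assign(long rangeStart, String drive) {
        driveByRangeStart.put(rangeStart, drive);
    }

    String driveFor(long token) {
        // the drive whose range start is the greatest one <= token
        return driveByRangeStart.floorEntry(token).getValue();
    }

    public static void main(String[] args) {
        RangePerDrive layout = new RangePerDrive();
        layout.assign(1L, "driveA");   // drive A handles 1-10
        layout.assign(11L, "driveB");  // drive B handles 11-20
        System.out.println(layout.driveFor(7)); // driveA: the data and its later tombstone both land here
    }
}
{code}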

          benedict Benedict added a comment -

           One possibility here is that we could split the bloom filter and metadata onto a separate disk from their data files, so that if/when a disk fails we have the option of scrubbing any records on the remaining disks that we think were present on the lost disk in a file with min_timestamp < gc_grace_seconds ago.

          Once we've done the scrub (in fact it could probably be "done" instantly by just setting up some filter for compaction + reads until we're fully repaired and have compacted the old data) we can start serving reads again, and can start a repair from the other nodes to receive data for all of the records we're now missing (either through the missing disk or that we're forcefully trashing).

          jbellis Jonathan Ellis added a comment -

          the whole disk_failure_policy stuff is broken

          I would say rather, disk_failure_policy works brilliantly so that if you're using tombstones you can set it to stop the server and rebuild it.

          If we divide each drive with ranges, then we are sure that the data along with the tombstone will get blacklisted.

          That will probably work well enough as long as vnode count >> disk count. Would have the added benefit of reducing fragmentation for STCS.

          Less than zero interest in trying to add sub-vnode "regions" though.

          One possibility here is that we could split bloom filter and metadata onto a separate disk to their data files

          Not really a fan; complicates moving data around significantly without generalizing well beyond a single disk failure. Even for single disk failures it bifurcates the recovery process: if you lose "data" then you scrub/repair; if you lose metadata you rebuild it from data.

          jbellis Jonathan Ellis added a comment -

           (Classifying this as an Improvement since, while the behavior is not optimal in this scenario, it's working as designed.)

          benedict Benedict added a comment - - edited

          if you lose "data" then you scrub/repair; if you lose metadata you rebuild it from data.

           You'd always have to do both with any single disk failure. But I agree it isn't optimal; it is, however, cost-free to maintain, so it's essentially just an optimisation plus an automated process to downgrade the node in the event of failure without having to manually rebuild it.

          Simply redundantly writing out the metadata would change it to a more uniform process, and tolerant to more than one failure, but at increased cost; at which point you might as well redundantly write out tombstones - either as a bloom filter or an extra sstable. The latter could be complicated to maintain cheaply and safely though. For multiple disk failures I'd say, if you have configured the auto-downgrading to happen - it should just trash everything it has and (optionally) repair.

          krummas Marcus Eriksson added a comment -

          Been poking this, wip-patch pushed here: https://github.com/krummas/cassandra/commits/marcuse/6696

           It does the following:

          • Extract an interface out of SSTableWriter (imaginatively called SSTableWriterInterface), start using this interface everywhere
           • Create DiskAwareSSTableWriter, which knows about the disk layout, and start using it instead of the standard SSTW
           • Ranges of tokens are assigned to the disks; this way we only need to check "is the key we are appending larger than the boundary token for the current disk?" If so, create a new SSTableWriter for that disk (see the sketch after this list)
          • Breaks unit tests
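
           A rough sketch of that boundary check (names here are stand-ins, not the branch's actual SSTableWriterInterface/DiskAwareSSTableWriter classes): keys arrive in token order, and we roll over to a new per-disk writer whenever a key crosses the current disk's boundary token.

{code:java}
import java.util.ArrayList;
import java.util.List;

class DiskAwareWriterSketch {
    private final long[] diskBoundaries;      // ascending upper boundary token per disk;
                                              // the last entry is the partitioner's max token
    private final List<List<Long>> perDiskKeys = new ArrayList<>(); // one "SSTableWriter" per disk
    private int currentDisk = 0;

    DiskAwareWriterSketch(long[] diskBoundaries) {
        this.diskBoundaries = diskBoundaries;
        for (int i = 0; i < diskBoundaries.length; i++) perDiskKeys.add(new ArrayList<>());
    }

    void append(long keyToken) {
        // keys are appended in token order, so we only ever move forward through the disks
        while (keyToken > diskBoundaries[currentDisk]) currentDisk++; // switch to the next disk's writer
        perDiskKeys.get(currentDisk).add(keyToken);                   // "append" to the current writer
    }
}
{code}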

          todo:

          • fix unit tests, general cleanups
           • I kind of want to name the interface SSTableWriter and call the old SSTW class something else, but I guess SSTW is the class that most external people depend on, so maybe not
          • Take disk size into consideration when splitting the ranges over disks, this needs to be deterministic though, so we have to use total disk size instead of free disk space.
          • Make other partitioners than M3P work
          • Fix keycache

           Rebalancing of data is simply a matter of running upgradesstables or scrub; if we lose a disk, we will take writes to the other disks

          Comments on this approach?

          jbellis Jonathan Ellis added a comment -

          Can we drop BOP/OPP in 3.0?

          benedict Benedict added a comment -

           Had a quick glance, and have one initial thought: it might be worth forcing compaction to always work on one disk (i.e. always select files from one disk for compaction). It would simplify things slightly, and it seems likely to be the most optimal use of IO; but also, as it stands you could have a scenario where one file each is selected from different disks, which would result in a perpetual compaction loop.

          benedict Benedict added a comment -

          It seems to me it might also be simpler, once this change is made, to just split the range of the memtable and call subMap(lb, ub) and spawn a separate flush writer for each range, which might avoid the need for an SSTableWriterInterface... Might also be a good time to introduce a separate flush executor for each disk.
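
           For illustration, a minimal sketch of that flush split (assumed types and names throughout; the real memtable is not a plain ConcurrentSkipListMap, and the per-disk executors are hypothetical here):

{code:java}
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;

class SplitFlushSketch {
    static void flush(ConcurrentSkipListMap<Long, String> memtable,
                      long[] diskUpperBounds,              // ascending; last = partitioner max token
                      ExecutorService[] perDiskFlushExecutor) {
        long lower = Long.MIN_VALUE;
        for (int disk = 0; disk < diskUpperBounds.length; disk++) {
            // subMap(lb, ub) view of just this disk's token range
            ConcurrentNavigableMap<Long, String> slice =
                    memtable.subMap(lower, true, diskUpperBounds[disk], true);
            perDiskFlushExecutor[disk].submit(() -> writeSstable(slice)); // one flush writer per range
            lower = diskUpperBounds[disk] + 1;               // next range starts past this boundary
        }
    }

    private static void writeSstable(ConcurrentNavigableMap<Long, String> slice) {
        // actual sstable writing omitted
    }
}
{code}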

          benedict Benedict added a comment -

          Last thoughts for the day: only major downside to this approach is that we are now guaranteeing no better than single disk performance for all operations on a given partition. So if there are particularly large and fragmented partitions, they could see read performance decline notably. One possible solution to this would be split by clustering part (if any), instead of partition key, but determine the clustering part range split as a function of the partition hash, so that the distribution of data as a whole is still random (i.e. each partition has a different clustering distribution across the disks). This would make the initial flush more complex, and might require more merging on reads, but compaction could still be easily constrained to one disk. This is just a poorly formed thought I'm throwing out there for consideration, and possibly outside of scope for this ticket.

           Either way, I'm not certain that splitting ranges based on disk size is such a great idea. As a follow-on ticket it might be sensible to permit two categories of disks: archive disks for slow and cold data, and live disks for faster data. Splitting by capacity seems likely to create undesirable performance characteristics, as two similarly performant disks with different capacities would lead to worse performance for the data residing on the larger disks.

          On the whole I'm +1 this change anyway, the more I think about it. I had been vaguely considering something along these lines to optimise flush performance, but it seems we can get this for free along with improving correctness, which is great.

          jbellis Jonathan Ellis added a comment -

          So if there are particularly large and fragmented partitions, they could see read performance decline notably

          Let's state for the record that solving this problem is a non-goal.

          (A single query might see higher latency, but overall we will do better even with large and fragmented partitions since the fragmentation and merging required will be less.)

          benedict Benedict added a comment -

          Let's state for the record that solving this problem is a non-goal.

          Fair enough. Just throwing it out there

          overall we will do better even with large and fragmented partitions since the fragmentation and merging required will be less.

          Possibly. It would depend on data distribution and size of ranges. If you had large-ish dense ranges per-disk, this shouldn't be a problem. But I think either way it's a major complication so at the very least not worth doing now, and since the distribution and split size are not tunable, maybe not ever.

          krummas Marcus Eriksson added a comment -

          Can we drop BOP/OPP in 3.0?

           Hmm, that would be nice. A big PITA would be rewriting all the unit tests that depend on order; created CASSANDRA-6922

          krummas Marcus Eriksson added a comment -

          It seems to me it might also be simpler, once this change is made, to just split the range of the memtable and call subMap(lb, ub) and spawn a separate flush writer for each range, which might avoid the need for an SSTableWriterInterface

          hmm yeah, might be better to not have the SSTWI and handle that outside to get more flexibility, I'll try to do that

          krummas Marcus Eriksson added a comment -

          pushed a new version to https://github.com/krummas/cassandra/commits/marcuse/6696-2

          • removed SSTWInterface, instead created a helper class that is reused in most places
          • multithreaded flush, one thread per disk
          • support multiple flush dirs
          • sort compaction/flush dirs lexicographically to make sure we always put the same tokens on the same disks (even if you rearrange dirs in config etc)
           • avoids compaction loops by making sure we never start STCS compactions with any sstables that don't intersect (which the sstables on different disks won't)
          • RandomP and Murmur3P supported, the rest will dump data on the first disk for now

          TODO:

          • ask user@ for remove-OPP/BOP feedback, otherwise make them work with JBOD, in the old way
          jbellis Jonathan Ellis added a comment -

           Can you add javadoc for splitRanges? Why is it partitioner-dependent?

          krummas Marcus Eriksson added a comment -

          Why is it partitioner-dependent?

           If we own all tokens, we need to know the partitioner's min and max tokens to be able to split them over the disks.

          javadoc added for splitRanges in the repo above

          benedict Benedict added a comment -

          Just a suggestion (not 100% certain it is better, but it seems cleaner to me):

          Once this feature is activated by the user, it might be easier to have an upgrade period during which sstables are migrated using DiskAwareWriter, but after which we know that the constraints hold. This would allow us to mostly leave the code unchanged in a few places (e.g. scrubber, compactiontask) which are already (prior to this ticket) a little on the complex side. It also seems like it would be easier to reason about behaviour in the future if we know these constraints are safely imposed, whereas using DiskAwareWriter leaves you with the impression we're never quite sure if the files obey our constraints or not.

          Really it's not a major issue, but worth considering.

          One other minor thing (more certain about this one though): perDiskExecutor should be an array of executors, one per disk; any configurable parallelism then should affect the number of threads each executor is given. Otherwise could get uneven distribution of work to the disks (especially as we add tasks in disk order, so if multiple tasks get queued at once, we'll get clumping of tasks by disk, reducing throughput on some disks through over-utilisation, and under-utilising the others.)

          benedict Benedict added a comment -

           A further suggestion: whilst we know vnodes don't currently distribute perfectly, this would be much simpler and more robust if we said that each disk simply gets assigned a contiguous 1/#disks portion of the total (global) token range. This way, once we migrate to the new layout we never have to worry about it again. As things stand, any addition or removal of a single node, or change in RF, triggers a need to rewrite the entire cluster. Whilst this does ensure even distribution across the disks, it seems like we leave some major holes in the protection we're offering, and filling them may be error prone (and certainly costly).

          So, my suggestion is that we permit this feature only for vnodes. We can, at the same time, perhaps visit the question of more deterministically allocating vnode ranges so that the cluster is evenly distributed.
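
           A minimal sketch of that contiguous 1/#disks split, assuming the Murmur3 partitioner's full token range (this is not the patch's splitRanges; names are made up):

{code:java}
import java.math.BigInteger;

class GlobalRangeSplit {
    // upper boundary token for each disk when the global range -2^63..2^63-1
    // is cut into equal contiguous portions, one per disk
    static long[] diskUpperBounds(int disks) {
        BigInteger min = BigInteger.valueOf(Long.MIN_VALUE);
        BigInteger span = BigInteger.valueOf(Long.MAX_VALUE).subtract(min); // 2^64 - 1 tokens
        long[] bounds = new long[disks];
        for (int i = 1; i <= disks; i++) {
            bounds[i - 1] = min.add(span.multiply(BigInteger.valueOf(i))
                                       .divide(BigInteger.valueOf(disks)))
                               .longValueExact();            // last disk always ends at Long.MAX_VALUE
        }
        return bounds;
    }
}
{code}

           Because the split depends only on the number of disks, adding or removing nodes in the cluster never changes which local disk a given token maps to.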

          sankalp kohli, what do you think?

          krummas Marcus Eriksson added a comment -

           Benedict, do you mean having a background job move data around after upgrade? Or blocking on startup and rewriting everything?

          Current version would end up with data on the correct disks eventually with compactions, but I agree it would be nice to be able to just care about the disks when flushing and streaming. Manually copying sstables into the datadirs and calling 'nodetool refresh' would also need some care.

          krummas Marcus Eriksson added a comment -

           Btw, not having to care about locations while compacting means we can't really keep having a separate flush directory, since data flushed to a directory would stay there forever. Wdyt, is it worth keeping flush directories and DiskAwareWriter everywhere, or should we drop support for a separate flush dir? With flushing being spread out over all disks, the advantages of having a separate flush dir are not as big.

          benedict Benedict added a comment -

          +1 on dropping separate flush dir. This is a better solution IMO - get full parallelism of the disks available.

          do you mean having a background job move data around after upgrade

          Yes, I think this would be preferable. Blocking at startup would make a rolling upgrade much too painful. If we mark all old sstables as compacting at startup, we can safely rewrite them in the background, and not worry about them violating our assumptions/constraints, since they're not eligible for regular compaction.

          jeromatron Jeremy Hanna added a comment -

          Do you mean dropping support for a separate flush directory for JBOD configurations or generally? Wouldn't it still have significant performance benefits in non-JBOD environments?

          krummas Marcus Eriksson added a comment -

          In those cases I think it would be better for the user to just create a JBOD configuration over those drives

          krummas Marcus Eriksson added a comment -

           pushed a new version to https://github.com/krummas/cassandra/commits/marcuse/6696-3 which:

          • adds nodetool command to rebalance data over disks so that user can do this whenever they want (like after manually adding sstables to the data directories)
          • removes diskawarewriter from everything but streams and the rebalancing command
          • makes the flush executor an array of executors.
          • splits ranges based on total partitioner range and makes this feature vnodes-only
          • supports the old way of doing things for non-vnodes setup (and ordered partitioners)

           There are still some of my config changes left in, as I bet there will be more comments on this

          jbellis Jonathan Ellis added a comment -

          If I have 256 vnodes and 8 disks, will a flush write 256 sstables or 8?

          krummas Marcus Eriksson added a comment -

          8

          jbellis Jonathan Ellis added a comment -

          would it simplify things to make it per-vnode?

          thinking we'd also get more compaction benefit that way... at the expense of doing more random-ish io on flush, but (1) this will be mitigated by larger off-heap memtables in 2.1 and (2) we could tune compaction vs io by adjusting number of vnodes, instead of being stuck w/ the disk count.

          /cc Tupshin Harper

          krummas Marcus Eriksson added a comment -

          I don't think it would simplify things much (this is quite simple already), but doing per-vnode sstables could enable some nice benefits, like turning off the exact vnodes that are affected by a disk failure or a mini auto-repair on corrupt sstables perhaps?

          The drawback I see is that we would end up with very many sstables, making it a real pita to do backups etc.

          tupshin Tupshin Harper added a comment -

          +1. To the extent that we can do sstables per vnode without introducing other performance costs, I am hugely in favor of it. With good OS tuning, I'm not scared of too many sstables. If it is a pain for backup, or other things, you could have an offline sstable consolidator script that would take a batch of sstables and stream them out as a single sstable to a remote location.

          benedict Benedict added a comment -

          The problem here is packing vnodes fairly across the disks: either we need to ensure that all vnodes are of roughly equal size (very difficult), or we probably need to have a dynamic allocation strategy, and the problem with that is that when the token range gets redistributed by node additions/removals, the whole cluster suddenly needs to start kicking off rebalancing of their local disks.

           We could support splitting the total token range into M distinct chunks, where M is preferably some multiple of the number of disks, and then allocate each chunk to a disk in round-robin fashion. This then remains deterministic, and it is I think easier to guarantee an even distribution within a given token range than it is to guarantee all vnodes are of equal size, whilst still supporting a dynamic cluster size. Even here, though, realistically I think we need the number of chunks to be quite a bit smaller than the number of vnodes to guarantee anything approaching balance of these chunks.
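
           A tiny sketch of that M-chunk mapping (hypothetical helper, again assuming the full Murmur3 token range): the global range is carved into M equal chunks and the chunks are dealt to disks round-robin, so the token-to-disk mapping stays deterministic as the cluster changes.

{code:java}
class ChunkRoundRobin {
    static int diskFor(long token, int chunks, int disks) {
        // fraction of the way through the global range -2^63..2^63-1
        // (double precision is plenty for picking a chunk)
        double position = (token - (double) Long.MIN_VALUE) / Math.pow(2, 64);
        int chunk = Math.min(chunks - 1, (int) (position * chunks));
        return chunk % disks;   // chunk i always lands on disk i % disks
    }
}
{code}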

          jbellis Jonathan Ellis added a comment - - edited

          doing per-vnode sstables could enable some nice benefits, like turning off the exact vnodes that are affected by a disk failure or a mini auto-repair on corrupt sstables perhaps?

          CASSANDRA-4784 lists some other benefits, the strongest of which I think are

          1. on disk failure, we can invalidate the affected vnodes and repair them, rather than continuing to serve incomplete data or halting the entire node [similar to what you are saying here]
          2. we can deduplicate ranges for bulk load into another cluster (CASSANDRA-4756)

          /cc sankalp kohli

          jbellis Jonathan Ellis added a comment -

          either we need to ensure that all vnodes are of roughly equal size (very difficult), or we probably need to have a dynamic allocation strategy

           Why is the first option "very difficult"? BOP aside (and the consensus was, we can continue supporting that because its users are willing to live with its limitations), assuming that every vnode is of roughly equal size is a core part of consistent hashing.

          "M distinct chunks" gives you the worst of both worlds.

          jbellis Jonathan Ellis added a comment -

          With good OS tuning, I'm not scared of too many sstables

          We can add subdirectory-per-vnode if necessary, but aren't modern FS capable of dealing with hundreds of thousands of files per directory?

          benedict Benedict added a comment -

           assuming that every vnode is of roughly equal size is a core part of consistent hashing.

          Well the assumption is broken then. I can assure you vnodes are not of equal size, especially not with our current allocation strategy, and getting them to be of equal size is kind of tough. We may be able to improve that, though.

           I'm not sure why what I'm suggesting can't also provide most of these other benefits; however, we can bring the two approaches closer by simply saying all vnodes starting within the first 1/DISK portion of the token range are allocated to the first disk, and so on - and then they're pretty similar. But the unequal size of vnodes means any compaction "tuning" will have limited impact, and will probably induce more random IO.

          tupshin Tupshin Harper added a comment -

          We can add subdirectory-per-vnode if necessary, but aren't modern FS capable of dealing with hundreds of thousands of files per directory?

          Exactly my thinking.

          tupshin Tupshin Harper added a comment -

          Well the assumption is broken then.

          Yes, very true, and I've been thinking for a while now that, while we don't need a strategy to keep all vnodes the exact same size, we would benefit from a background process that gradually splits and combines the largest and smallest outliers to have vnodes tend to converge on the same size.

          jbellis Jonathan Ellis added a comment -

          To clarify: vnodes are not equal in size but they are proportional to token distance, again with the exception of BOP. So we can easily do a knapsack problem across the local disks on first startup.
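
           One way to read the knapsack idea, as a hedged sketch (purely illustrative; a simple greedy balance rather than an exact knapsack, with token distance standing in for vnode size per the proportionality point above):

{code:java}
import java.util.Arrays;

class VnodeKnapsackSketch {
    // assign each vnode (identified by index) to a disk, largest-first onto the
    // currently least-loaded disk, so total token distance per disk stays balanced
    static int[] assign(long[] vnodeSizes, int disks) {
        Integer[] order = new Integer[vnodeSizes.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Long.compare(vnodeSizes[b], vnodeSizes[a])); // biggest first

        long[] load = new long[disks];
        int[] assignment = new int[vnodeSizes.length];
        for (int v : order) {
            int target = 0;
            for (int d = 1; d < disks; d++) if (load[d] < load[target]) target = d;
            assignment[v] = target;          // put this vnode on the least-loaded disk
            load[target] += vnodeSizes[v];
        }
        return assignment;
    }
}
{code}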

          tupshin Tupshin Harper added a comment -

          Agreed. Seems quite sufficient for this problem.

          benedict Benedict added a comment -

           My issue with that is that adding or removing a node becomes an operation proportional in size to the number of nodes in the cluster.

          jbellis Jonathan Ellis added a comment -

          How's that?

          benedict Benedict added a comment -

           You have to rerun your knapsack algorithm after each change of cluster token allocation to avoid getting a very skewed distribution across the disks. What I'm suggesting is allocating vnodes in a way that is designed to deterministically allow groupings that never need rebalancing.

          tupshin Tupshin Harper added a comment -

          or we probably need to have a dynamic allocation strategy, and the problem with that is that when the token range gets redistributed by node additions/removals, the whole cluster suddenly needs to start kicking off rebalancing of their local disks.

           A node addition will add 256 vnodes to the ring. Unless I misunderstand, this will be DC-local resizing of vnodes, and if the cluster is huge, there will still only be 256 (times RF?) different resize operations that have to take place in that DC. So there is a finite cap on the amount of work that needs to be performed per node addition (and presumably removal), and that cap is actually bounded by vnodes per node, not by cluster size.
          If true, then Jonathan's solution feels good enough, since the upper bound is reasonably constrained. Not saying I wouldn't prefer doing less overall work, though.

          jbellis Jonathan Ellis added a comment - - edited

          What I'm suggesting is allocating vnodes in a way that is designed to deterministically allow groupings that never need rebalancing

          But I don't think yours accomplishes that either. No matter how you allocate token ranges across disk, if a new node "steals" from a range that intersects disk X but not disk Y, you're going to end up with more imbalance post-bootstrap than you had before.

          benedict Benedict added a comment -

          if a new node "steals" from a range that intersects disk X but not disk Y, you're going to end up with more imbalance post-bootstrap than you had before.

          Sure, it will steal an amount, but if the allocation of new vnodes ensures that any stealing happens equally distributed across the cluster then while any single node will cause an imbalance, the total imbalance of the cluster is kept bounded throughout an arbitrary number of node additions. So that you never get perfection, but you're never far from it either. The basic idea is that while you cannot easily guarantee the size of any single vnode, you can guarantee that if you collect any N adjacent vnodes together that their total owned range is within some proportion of the ideal. As N grows the proximity to perfect increases.

          there is a finite cap on the amount of work needed to be performed per node addition

          Sure, but that's a reasonably large cap - for all clusters with fewer than 256 nodes my statement holds true

          tupshin Tupshin Harper added a comment -

          True. I merely mean to say that the problem doesn't get horrible at extreme scale. You could also optimize for rapid additions by deferring rebalancing until all nodes are added (nodetool disablebalancing, nodetool enablebalancing), or some such. Still not arguing for it, though.

          jbellis Jonathan Ellis added a comment -

          I think if you draw it out on paper you'll see that "assign vnodes via knapsack, then steal some token ranges" and "divide into M token ranges, then steal some" work out to about the same imbalance post-bootstrap. Am I misunderstanding what you are proposing?

          benedict Benedict added a comment - - edited

          I may be misunderstanding your proposal: I assume you mean assign the vnodes to each disk via knapsack? In which case your balance per disk is based solely on the knapsack. If the cluster wide vnode allocation is designed specifically to ensure that any given range will maintain the property I gave (i.e. that any adjacent N will be within some proportion of the ideal ownership proportion) then the balance is based on that and will continue to be true no matter how many nodes are added to the cluster, whereas you will have to re-knapsack each time the ownership range changes.

          jbellis Jonathan Ellis added a comment -

          Why do we care about adjacent N? When new nodes join they will choose random tokens.

          benedict Benedict added a comment -

          Adjacent N is another way of saying all vnodes assigned to a given real node but within a contiguous range of the total token range.

          I only care about it because I think it is tractable to create a vnode allocation algorithm that fits this bill (I already got a naive approach working almost well enough when I hacked around for a couple of hours, I'm sure a much more optimal algorithm is within our grasp if we put a bit of thought into it)

          jbellis Jonathan Ellis added a comment -

          Okay, but I think that's clearly a different ticket. In the meantime, sstable-per-vnode has a lot of advantages.

          benedict Benedict added a comment -

          Okay, but I think that's clearly a different ticket. In the meantime, sstable-per-vnode has a lot of advantages.

          Agreed, it's CASSANDRA-7032

          But I guess what I'm saying is let's hold off knapsacking and rebalancing, as that's a lot of added complexity to this ticket, and we can probably fix it more easily with CASSANDRA-7032.

          krummas Marcus Eriksson added a comment -

          OK, working on flushing to per-vnode sstables now

           Splitting the total range into #disks parts, then iterating over the ranges and flushing all vnodes with a start token within the disk boundaries to that disk.

          Then we need to figure out how to handle LCS, perhaps one leveled manifest per vnode is the way to go...

          krummas Marcus Eriksson added a comment -

           Pushed a semi-working sstable-per-vnode version here: https://github.com/krummas/cassandra/commits/marcuse/6696-3 (by no means review-ready)

          • flushes to vnode-separate sstables, spread out over the disks available
           • keeps the sstables separate during compaction, for STCS by grouping the compaction buckets by overlapping sstables, and with LCS by keeping a separate manifest for every vnode.

           Still quite broken, but I think good enough to evaluate whether we want to go this way. The drawback is mainly that it takes a looong time to flush to 768 sstables instead of one (768 = num_tokens=256 and RF=3). Doing 768 parallel compactions is also quite heavy.

          Unless anyone has a brilliant idea how to make flushing and compaction less heavy, I think we need some sort of balance here, maybe grouping the vnodes (8 or 16 vnodes per sstable perhaps?) so that we flush a more reasonable amount of sstables, or even just going with the per-disk approach?

          benedict Benedict added a comment -

          I think we may be able to get a good half-way house by setting a minimum sstable size below which we aggregate vnodes into a single sstable, ensuring we always keep a whole vnode in one table (unless that vnode is larger than the maximum sstable size, in which case we split it, and it alone). This should be cost free and tend rapidly towards separate sstables per vnode for all but the most recent data, which could simply ALL be copied over to any nodes we want to duplicate data to, as the overhead would be approximately constant regardless of the amount of data the node is managing. We could introduce a tool to split out a single token range from those files for users who wanted to avoid this fixed overhead cost.
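
          A rough sketch of that aggregation rule, assuming per-vnode size estimates are available; minSize/maxSize and the vnode keys here are placeholders for illustration only, not the actual writer:

              import java.util.*;

              // Pack whole vnodes into sstables: aggregate small neighbouring vnodes until minSize
              // is reached; a vnode that alone exceeds maxSize gets a group (and later a split) of its own.
              public class VnodeGrouper
              {
                  static List<List<Long>> group(SortedMap<Long, Long> sizeByVnodeStart, long minSize, long maxSize)
                  {
                      List<List<Long>> groups = new ArrayList<>();
                      List<Long> current = new ArrayList<>();
                      long currentSize = 0;
                      for (Map.Entry<Long, Long> vnode : sizeByVnodeStart.entrySet())
                      {
                          if (vnode.getValue() >= maxSize)
                          {
                              if (!current.isEmpty()) { groups.add(current); current = new ArrayList<>(); currentSize = 0; }
                              groups.add(new ArrayList<>(Collections.singletonList(vnode.getKey())));
                              continue;
                          }
                          current.add(vnode.getKey());
                          currentSize += vnode.getValue();
                          if (currentSize >= minSize) { groups.add(current); current = new ArrayList<>(); currentSize = 0; }
                      }
                      if (!current.isEmpty())
                          groups.add(current);
                      return groups;
                  }
              }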

          krummas Marcus Eriksson added a comment -

          this special cases compaction a bit though: we could have sstables that overlap with other sstables of similar size that we can't really compact together (which we probably shouldn't since they overlap too little (CASSANDRA-6474)).

          for LCS i guess we could align the vnode start/end to the sstables' start/end. I.e., in level 1 (10 sstables) each sstable would contain ~100 vnodes, in level 2 (100 sstables) ~10, and in level 3 (1000 sstables) 1 vnode. Then we could flush sstables mapping to the sstables in level 1 to only compact those together.

          benedict Benedict added a comment -

          Or (somewhat handwavy, just to give a basic outline of the idea): we could say each vnode has its own LCS hierarchy - this is optimal from a read perspective - and perhaps have L1 switch to 1 file in size by default (L2 being 10, etc), and then for our flush to L0 we write files equivalent in size to one L1 file, grouping however many vnodes fit in the flush, and then only merge with the individual L1s once the density of the relevant portion of L0 is > ~0.5 per vnode

          krummas Marcus Eriksson added a comment -

          flush to L0 we write files equivalent in size to one L1 file, grouping however many vnodes fit in the flush

          just checking that i get it: one sstable in L1 is one vnode, current default size is 160M, we would flush a 1.6G memtable into 10 L0 files?

          only merge with the individual L1s once the density of the relevant portion of L0 is > ~0.5 per vnode

          could you elaborate?

          benedict Benedict added a comment -

          only merge with the individual L1s once the density of the relevant portion of L0 is > ~0.5 per vnode

          I mean once the amount of data we would flush into the next level is, on average, equal to 50% of the size limit of the lower level. But that is too high (see below)

          current default size is 160M

          I was reading stale docs that set it at 5MB. Somewhere in between seems sensible - 20MB? That way we'd get 1.6GB into 80 files; if we have 768 vnodes and we set the ratio for flushing down into the lower level at 0.1 we'd on average merge straight into L1, but in reality this would only happen for those vnodes with sufficient density, and those without would pause until sufficient density of data appeared.

          The only slight complication to this is what we do if there are then files containing enough data to get merged into one L1, but another portion is much too small to be efficient to merge down - in this case I'd suggest simply remerging out the data that would be inefficient to merge into L0, until it hits our merge threshold (or is >= in size to the data already present in L1, if L1 is not very full). Alternatively we could, for simplicity, simply always merge as soon as the average for any file exceeds our threshold, but I'm not convinced this is a great strategy.
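
          To put rough, purely illustrative numbers on that: a 1.6GB memtable flushed into ~20MB L0 files gives 80 files; spread across 768 vnodes that is on average only ~2MB of new data per vnode, i.e. roughly 0.1 of a 20MB L1 target. With the threshold set at 0.1 the average vnode would be eligible to merge into L1 straight away, while sparsely written vnodes would wait in L0 until enough data had accumulated.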

          jbellis Jonathan Ellis added a comment -

          It looks like what HBase does (as of fairly recently) is flush to a single file, then break it up into sub-regions/stripes/vnodes when compacting (multiple L0 files) with L1+.

          https://issues.apache.org/jira/secure/attachment/12576005/Stripe%20compactions.pdf

          https://issues.apache.org/jira/browse/HBASE-7667

          benedict Benedict added a comment -

          I'd note that it appears what they're doing is really a different compaction strategy - the approach is not dissimilar to what I'm suggesting here for our L0 only, and it may be that we could/should implement it generally, but I think the two are slightly orthogonal tasks (since we're no doubt going to be keeping LCS around). It's worth noting that their stripes are not based on vnodes, but on the distribution of the data present, with merging/splitting as a given range gets too small/big.

          jbellis Jonathan Ellis added a comment -

          HBase doesn't have compaction strategies per se, but you can think of this as an extension of their STCS strategy. Still, I don't see any reason why it can't apply across the board for us.

          It's worth noting that their stripes are not based on vnodes, but on the distribution of the data present, with merging/splitting as a given range gets too small/big.

          Sort of. They have a special case where you can do "size based stripes" for workloads where you have mostly-increasing keys. (Remember that HBase uses an ordered partitioner.) That doesn't apply to us, but the range-based stripes are basically exactly the same as our vnodes here.

          tupshin Tupshin Harper added a comment -

          They are basically splittable and resizable vnodes if you were to use shuffled vnodes with a byte ordered partitioner. Which makes them have more in common with CQL partitions than with vnodes, from a "range of data" point of view. Except that the size of the ranges don't vary with the data model like they do with Cassandra.

          tupshin Tupshin Harper added a comment -

          HBase actually has pluggable compaction strategies these days.

          benedict Benedict added a comment -

          That doesn't apply to us

          It might apply more with the proliferation of composite keys. I would like to see our compaction strategies make more use of this information eventually, and these are ordered.

          That doesn't apply to us, but the range-based stripes are basically exactly the same as our vnodes here.

          So you mean to apply it only as far as the vnode boundaries and then switch to STCS/LCS?

          jbellis Jonathan Ellis added a comment -

          Right, I assumed we're going to be doing STCS/LCS w/in vnode boundaries. No?

          benedict Benedict added a comment - - edited

          Right, I assumed we're going to be doing STCS/LCS w/in vnode boundaries. No?

          Yes, just checking I wasn't misunderstanding what you were saying. I think this sounds roughly in line with what I was suggesting in that case (i.e. +1 this, or some similar variant thereof)

          krummas Marcus Eriksson added a comment -

          summing up the discussion:

          • one "stripe" is one vnode
          • we flush to big files in L0, file per disk or perhaps group a bunch of vnodes together to increase the amount of parallel compactions we can do L0 -> L1

          for STCS:

          • we introduce L0 for STCS
          • when we end up with a given number of overlapping L0 files (4), we compact those together and create per-vnode L1 files.
          • major compaction: include all files in compaction, write #vnodes files

          for LCS:

          • We introduce a leveled manifest per vnode
          • L0 is "global"
          • when doing L0 -> L1 compactions, we end up with one file per involved vnode-stripe in L1, here we can gain a lot by not flushing too big L0 files.
          • we still do STCS within L0 if we get too much data here, making sure we only compact overlapping files

          anything i missed/misunderstood?
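
          As a minimal sketch of the STCS L0 rule in the summary above (find groups of mutually overlapping L0 files and compact a group once it reaches the threshold); the Range record and the threshold handling are illustrative only, not the actual bucketing code:

              import java.util.*;

              public class L0OverlapGrouper
              {
                  record Range(long first, long last) {}   // the token span covered by one L0 sstable

                  // Cluster L0 sstables whose token spans intersect (transitively); a cluster with at
                  // least `threshold` members would be compacted together into per-vnode L1 files.
                  static List<List<Range>> groupsReadyToCompact(List<Range> sstables, int threshold)
                  {
                      List<Range> sorted = new ArrayList<>(sstables);
                      sorted.sort(Comparator.comparingLong(Range::first));
                      List<List<Range>> ready = new ArrayList<>();
                      List<Range> current = new ArrayList<>();
                      long currentLast = Long.MIN_VALUE;
                      for (Range r : sorted)
                      {
                          if (current.isEmpty() || r.first() <= currentLast)
                          {
                              current.add(r);
                              currentLast = Math.max(currentLast, r.last());
                          }
                          else
                          {
                              if (current.size() >= threshold) ready.add(current);
                              current = new ArrayList<>(Collections.singletonList(r));
                              currentLast = r.last();
                          }
                      }
                      if (current.size() >= threshold) ready.add(current);
                      return ready;
                  }
              }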

          jbellis Jonathan Ellis added a comment -

          major compaction: include all files in compaction, write #vnodes files

          Minor note: Doing it per-vnode would mean you don't have to wait for the entire dataset to finish before promoting some tmp to finished, and gets you "partial credit" if interrupted.

          here we can gain a lot by not flushing too big L0 files.

          I'm not sure I follow that point.

          krummas Marcus Eriksson added a comment -

          we can gain a lot by not flushing too big L0 files.

          if we flush to one big file, we would have to involve all L1 sstables when compacting L0 -> L1, if we flush smaller files, we can do more compactions in parallel and we don't have to wait for all ongoing L1 -> L2 compactions to finish before starting L0 -> L1

          tupshin Tupshin Harper added a comment -

          I may be misunderstanding, but this seems to be optimizing for compaction throughput/parallelization at the expense of doing more total compaction activity (number of compactions per mutation over the life of that mutation, a form of write-amplification) by starting with smaller tables.

          If that's not the case, then please ignore, but it is important to note that for the largest scale, highest velocity, longest retained use cases, it's the number of recompactions/write amplification that really hurts.

          krummas Marcus Eriksson added a comment -

          not really, note that we take the memtable, split it into X parts and write those parts to disk:

          for example, say we have 100 vnodes, meaning we have 100 non-intersecting L1 sstables. If we then flush one file that intersects with all those sstables, we would have to include all those files when we compact L0 -> L1. If we instead flush to 10 non-intersecting sstables in L0, we can do those L0 -> L1 compactions independently, and each mutation is still recompacted the same number of times

          does that make sense?

          tupshin Tupshin Harper added a comment -

          It does, thanks.

          jbellis Jonathan Ellis added a comment -

          So maybe, flush to one L0 file per disk?

          krummas Marcus Eriksson added a comment -

          flush to one L0 file per disk

          yep, will do that first, then tweak and benchmark if more files are better

          krummas Marcus Eriksson added a comment - - edited

          Just pushed a version to https://github.com/krummas/cassandra/commits/marcuse/6696-4 - I'll spend some more time writing tests, but I figure it is ready for feedback now at least.

          • Flush to one sstable per disk:
            • Split the total range into #disks parts
            • Flush whole vnodes: if a vnode starts on a disk, it stays there. Note though that if a vnode wraps around the tokenspace, it will be split in 2 parts and be on different disks.
          • SSTables flushed during startup will not get placed correctly since we don't yet know the local ranges.
          • LeveledCompaction needs to know what ranges we have, so calling startup() on the CompactionStrategy has been moved out of the CFS constructor
          • LCS:
            • One manifest per vnode, with a global L0.
            • L1 now aims to contain one sstable
            • Same priorities as before: first STCS in L0, then compactions in L1+, and last L0 -> L1.
            • STCS in L0 will create big per-disk files, not per-vnode ones.
          • STCS:
            • We now have L0 and L1; L1 contains per-vnode sstables, but within the vnode sstables we give no overlap guarantees
            • Compactions in L0 only include L0 sstables, and L1 compactions only include L1 sstables; all compactions end up as per-vnode sstables in L1
            • When we get 4 sstables of similar size in L0, we will compact those and create num_tokens L1 sstables.
            • When one L1 vnode gets 4 sstables of similar size, it will compact those together
            • L0 -> L1 compactions are prioritized over L1 -> L1 ones (though these will run in parallel)
          • Introduces originalFirst to keep track of the original first key of the sstable; we need this when figuring out which manifest the sstable belongs to during replace(..).
          • If we get a new ring version (i.e. we get a new token or lose one), we only reinitialize the LeveledManifestWrapper; this means that we might have sstables that start in one vnode but do not end in it.
          • "nodetool rebalancedata" will iterate over all sstables and make sure they are in the correct places.
          • If a disk breaks/runs out of space we will flush/compact to the remaining disks
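
          A hand-wavy sketch of what that "rebalancedata" pass could look like (the File paths and the diskFor helper are stand-ins, not the real implementation):

              import java.io.File;
              import java.util.*;

              public class RebalanceCheck
              {
                  // Same even split of the token space over disks as at flush time.
                  static int diskFor(long token, long minToken, long maxToken, int diskCount)
                  {
                      double sliceWidth = (double) (maxToken - minToken) / diskCount;
                      return Math.min((int) ((token - minToken) / sliceWidth), diskCount - 1);
                  }

                  // For every sstable, recompute which data directory its first token should live on
                  // and report the ones that need to be moved/rewritten.
                  static Map<File, File> misplaced(Map<File, Long> firstTokenBySSTable,
                                                   List<File> dataDirectories, long minToken, long maxToken)
                  {
                      Map<File, File> toMove = new LinkedHashMap<>();
                      for (Map.Entry<File, Long> e : firstTokenBySSTable.entrySet())
                      {
                          File expected = dataDirectories.get(diskFor(e.getValue(), minToken, maxToken, dataDirectories.size()));
                          if (!expected.equals(e.getKey().getParentFile()))
                              toMove.put(e.getKey(), expected);
                      }
                      return toMove;
                  }
              }
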
          jbellis Jonathan Ellis added a comment -

          Not serving stale data is good, but warning the other nodes when we blacklist a disk, so that they read those vnodes' data from other replicas, would be even better. New ticket for that?

          brandon.williams Brandon Williams added a comment -

          A simple approach would be for the node to increase its severity when it has blacklisted a disk.

          jbellis Jonathan Ellis added a comment -

          Do we have per-vnode severity? We want to blacklist just the affected vnodes.

          brandon.williams Brandon Williams added a comment -

          No, but I guess it wouldn't be too hard to add if we just advertised a list of affected vnodes only.

          benedict Benedict added a comment -

          Linking CASSANDRA-7551 to remember to roll it back when we eventually merge this, as the prior default memtable_cleanup_threshold of 0.4 should be approximately optimal again.

          jbellis Jonathan Ellis added a comment -

          Yuki Morishita, can you pick up review here?

          krummas Marcus Eriksson added a comment -

          this is not really ready for review, i'm going to pick this up again once CASSANDRA-8004 has been committed (that rebase will be epic)

          but Yuki Morishita, please have a look if you have any feedback/input

          krummas Marcus Eriksson added a comment -

          A bit of a status update;

          This is now basically 3 parts;

          1. multi threaded flushing - one thread per disk, splits the owned token range evenly over the drives
          2. one compaction strategy instance per disk
          3. optional vnode aware compaction strategy that you can use if you are using vnodes:
            • keeps 2 levels of sstables, level 0 is newly flushed, bigger sstables, level 1 contains sstables per vnode
            • to avoid getting massive amounts of sstables in L1, we don't compact a vnode into L1 until we estimate that we can reach a configurable sstable size. During an L0 compaction (which contains data from all vnodes) we estimate whether "the next" vnode has enough data for an L1 sstable; otherwise we keep the data for that vnode in L0 until the next compaction.
            • within each vnode we do size tiering

          todo:

          • rebalancing after ring changes and when disks break
          • we can flush before knowing what ranges we own (e.g. during commit log replay) - we might need to persist which tokens this node has (this includes local tokens and others we have due to replication)
          • improving compaction strategy heuristics
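
          For the "one compaction strategy instance per disk" part, a simplified sketch of how an sstable could be routed to the right per-disk strategy; the boundary tokens and the index lookup are illustrative only, the real routing logic differs in detail:

              import java.util.*;

              public class PerDiskStrategyRouter
              {
                  // One compaction strategy slot per data directory; each slot owns the sub-range of
                  // tokens up to and including its boundary. Boundaries are kept sorted ascending.
                  private final long[] diskUpperBoundaries;

                  public PerDiskStrategyRouter(long[] diskUpperBoundaries)
                  {
                      this.diskUpperBoundaries = diskUpperBoundaries.clone();
                      Arrays.sort(this.diskUpperBoundaries);
                  }

                  // Route an sstable by its first token: binary search for the first boundary >= token.
                  public int strategyIndexFor(long sstableFirstToken)
                  {
                      int idx = Arrays.binarySearch(diskUpperBoundaries, sstableFirstToken);
                      if (idx < 0)
                          idx = -idx - 1;   // insertion point
                      return Math.min(idx, diskUpperBoundaries.length - 1);
                  }
              }
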
          nickmbailey Nick Bailey added a comment -

          So I just want to mention that the current approach here isn't going to help us much with CASSANDRA-4756.

          If you don't update your compaction strategy, sstables will contain data from many vnodes, so things aren't much different from now. If you do use the new compaction strategy, things are slightly better in that levels 1 or higher are split per vnode and you could deduplicate that data, but level 0 won't be, so you'll still be forced to overstream anything in level 0.

          We may want to revisit a new approach to CASSANDRA-4756, specifically one that isn't compaction strategy specific.

          nickmbailey Nick Bailey added a comment -

          I'd also like to mention that we should consider what the best way to expose this new information to operators is. Specifically, what vnodes are assigned to what disk? What vnode is an sstable responsible for? It should be possible to get that information without running sstablemetadata against every sstable file.

          jjordan Jeremiah Jordan added a comment -

          multi threaded flushing - one thread per disk, splits the owned token range evenly over the drives

          It might be nice to have this be configurable. This one flush per drive still results in L0 having sstables that overlap with almost all of L1 on a per drive basis. If you flush to X ranges per drive, then you can get some of the benefits of more parallel L0->L1 promotion even if you only have one drive.

          krummas Marcus Eriksson added a comment -

          It might be nice to have this be configurable. This one flush per drive still results in L0 having sstables that overlap with almost all of L1 on a per drive basis. If you flush to X ranges per drive, then you can get some of the benefits of more parallel L0->L1 promotion even if you only have one drive.

          It might, or you just set up multiple data directories for the same drive. We can improve this later, please create a ticket that depends on this.

          krummas Marcus Eriksson added a comment -

          Specifically, what vnodes are assigned to what disk? What vnode is an sstable responsible for? It should be possible to get that information without running sstablemetadata against every sstable file.

          yes, we could in theory create sub directories per vnode for example, then we would get the sstables very easily. But, again, we can do this after we commit this, please create a new ticket that depends on this

          krummas Marcus Eriksson added a comment -

          Ryan McGuire Philip Thompson I think I'm going to need some test-infrastructure help here - I want to be able to run all dtests (and ccm) with multiple data directories, can you guys help out?

          philipthompson Philip Thompson added a comment -

          I think the easiest way would be to give you a patched ccm that defaults to multiple data directories

          krummas Marcus Eriksson added a comment -

          Sure, I can do that myself, but it would be nice to have it in the future as well

          philipthompson Philip Thompson added a comment -

          Well it can easily be done right now through the existing ccmlib API. The only issue is that if it isn't the default, we would need to set it on a test-by-test basis. Unless you think going forward it should ALWAYS be the default? I can do that easily as well.

          krummas Marcus Eriksson added a comment -

          We should perhaps randomize it on a per-dtest case in the future (or perhaps do that in some other test system), but for now, a branch with multiple directories as default will be ok

          yukim Yuki Morishita added a comment -

          Besides the code review going on in Marcus' branch on github, I have one question.

          For non-vnode, the current implementation splits local ranges from start to end evenly over disks. It looks like it assumes that local ranges are close to each other.
          But isn't there a situation where a node's local ranges are very sparse (maybe NTS with multiple DCs/racks)?
          In that case, disks can be unbalanced.
          Should we calculate more precise ownership for the ranges and assign them evenly to disks?
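
          One way to do that, as a greedy illustration only (real ranges are Token ranges, can wrap around, and ownership estimation is more involved than a simple span): weight each local range by its size and always hand the next-largest range to the least-loaded disk.

              import java.util.*;

              public class OwnershipBalancer
              {
                  record LocalRange(long left, long right) { long span() { return right - left; } }

                  // Greedily assign ranges to disks so that the summed span per disk stays roughly even,
                  // instead of just cutting the first-to-last token interval into equal slices.
                  static List<List<LocalRange>> assign(List<LocalRange> ranges, int diskCount)
                  {
                      List<List<LocalRange>> perDisk = new ArrayList<>();
                      for (int i = 0; i < diskCount; i++)
                          perDisk.add(new ArrayList<>());
                      long[] load = new long[diskCount];

                      List<LocalRange> sorted = new ArrayList<>(ranges);
                      sorted.sort(Comparator.comparingLong(LocalRange::span).reversed());
                      for (LocalRange r : sorted)
                      {
                          int target = 0;
                          for (int i = 1; i < diskCount; i++)
                              if (load[i] < load[target]) target = i;
                          perDisk.get(target).add(r);
                          load[target] += r.span();
                      }
                      return perDisk;
                  }
              }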

          benedict Benedict added a comment -

          Since CASSANDRA-7032 will likely make it in ahead of this, we should consider making this feature require a migration in which each disk is directly assigned a collection of vnodes. On migration we can do our best to spread the existing vnodes evenly onto each disk, and if the end result is too uneven for any given node, we can assign some new tokens to the disks until it all works out.

          krummas Marcus Eriksson added a comment -

          https://github.com/krummas/cassandra/commits/marcuse/6696-11

          I've broken out the compaction strategy part in a separate ticket to decrease the size of this review (and to get a clear separation of the two issues).

          dtest updates are here: https://github.com/krummas/cassandra-dtest/commits/marcuse/6696 and they require this ccm: https://github.com/krummas/ccm/commits/multi-data-dirs - for the tests to pass we also need CASSANDRA-10421

          Benedict could you have a look at the memtable flushing parts as I'm sure you will have improvement suggestions there?

          Yuki Morishita one thing that might be an issue is the fact that during streaming we now create several sstables - this might break the ProgressInfo as we key by filename. Do you have any suggestions on how to fix that?

          Ryan McGuire I think I need a bit of help with the testing here - I want to make sure nothing explodes with a big node (200+G) on 2.1/2.2/3.0 that upgrades to this with the various compaction strategies

          carlyeks Carl Yeksigian added a comment -

          Overall, this feature looks really good. The biggest concern I have is that currently partition size is only limited by the disk size; with this, the whole token range containing the largest partition has to fit on a single disk, which is a change we should be calling out in NEWS.

          • What purpose does perDiskFlushExecutor serve for us? In the past, the number of flush executors has determined how many concurrent executors we can run, while this seems like it would restrict the parallelism. Also, if we were using a different partitioner, we would only ever be running through a single flush executor.
          • If we have an imbalanced distribution of data, "rebalance disks" isn't going to do anything for us, so it seems misnamed; we are just formulaically reassigning the sstables based on token range size / # sstables.
          • In CompactionManager.rebalanceDisks, we will throw an assertion error instead of returning null if the partition doesn't define splitting behavior.
          • Todo in CompactionStrategyManager.groupSSTablesForAntiCompaction; doesn't look implemented, but we probably want to make sure that we keep each disk's group of sstables separate.
          • If we run a rebalance, it is possible that we'll move all of disk 1 to disk 2 before any of disk 2's sstables move somewhere else, causing out of space issues. Might be better to actively mix up the order of disks from which we are pulling sstables.
          • In CompactionAwareWriter.getWriteDirectory, we aren't checking to make sure that we have enough disk space to run the compaction (see the sketch after this list).
          • Should we provide a default implementation of getMaximumToken, since this will be introduced mid-3.0 cycle, and mark it for removal at 4.0?
          • Added methods in SSTableTxnWriter don't seem used; they should be removed, or SSTableTxnWriter should inherit from SSTableMultiWriter.
          • RangeAwareSSTableWriter should be called something else. For CASSANDRA-10540, we'll want to add a new one which splits the incoming stream by each token range, this one is splitting by disk assignments.
          • RangeAwareSSTableWriter seems to be violating the transactional contract (there is also a TODO in commit). precommit and commit do nothing for the finished writers.
          • We should make sure that we have a splitter defined for the partition. We should make sure that the splitter is defined, otherwise we'll be creating a leveling where we are sending a huge number of sstables back to level 0 for overlap.

          nits:

          • perDiskFlushExecutors should be initialized to an array of ExecutorService instead of JMXEnabledThreadPoolExecutor, formatting space
          • write sorted contents log message should include the range, as we'll be writing the same memtable many times
          • comment why we need maybeReload in CompactionStrategyManager.handleNotification
          • comment how getCompactionStrategyIndex handles going from randomly partitioned sstables to sorting into the correct strategy - took me a little bit to realize that it should only fall into the binary search case if there is already data saved
          • in CompactionAwareWriter, "panicLocation" needs a different name; maybe "defaultLocation" would be better since this is expected to happen on, for instance, system tables
          • getMaximumToken needs a javadoc
          • spurious line deleted in RandomPartitioner at L219
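
          Regarding the getWriteDirectory point above, a hypothetical sketch of the missing space check; the method and its callers are made up for illustration and the real CompactionAwareWriter/Directories API differs:

              import java.io.File;
              import java.util.List;

              public class WriteDirectoryPicker
              {
                  // Pick the first candidate data directory that can hold the expected compaction output.
                  static File pickWithSpaceCheck(List<File> candidates, long expectedWriteSize)
                  {
                      for (File dir : candidates)
                          if (dir.getUsableSpace() > expectedWriteSize)
                              return dir;
                      throw new RuntimeException("Not enough disk space for " + expectedWriteSize + " bytes of compaction output");
                  }
              }
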
          brianmhess Brian Hess added a comment -

          Will this change do anything for the case where there is only one disk? Will each sstable contain a single token range (or at least not all token ranges)?

          krummas Marcus Eriksson added a comment -

          Brian Hess no, it will be the same if you only have a single disk, CASSANDRA-10540 will split sstables based on local tokens

          carlyeks Carl Yeksigian added a comment -

          Yuki Morishita: I've looked at this whole changeset, but it would be great to make sure that the streaming portion is correct; could you review that part?

          krummas Marcus Eriksson added a comment -

          What purpose does perDiskFlushExecutor serve for us? In the past, the number of flush executors has determined how many concurrent executors we can run, while this seems like it would restrict the parallelism.

          Idea is that we have one thread per disk writing, but I guess the thread count should be DatabaseDescriptor.getFlushWriters() per disk and the flushExecutor thread count should be 1 - we want to quickly hand off to the single flushExecutor when flushing and then run the per disk writing in the perDiskFlushExecutor. Do you have any other suggestion on how to model this?
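
          A bare-bones sketch of that executor layout (one single-threaded hand-off executor plus one writer pool per data directory; writersPerDisk stands in for memtable_flush_writers - illustration only, not the actual code):

              import java.util.concurrent.*;

              public class PerDiskFlushExecutors
              {
                  // Hand-off executor stays single-threaded; the per-disk pools do the actual writing.
                  final ExecutorService flushCoordinator = Executors.newSingleThreadExecutor();
                  final ExecutorService[] perDiskFlushExecutors;

                  public PerDiskFlushExecutors(int diskCount, int writersPerDisk)
                  {
                      perDiskFlushExecutors = new ExecutorService[diskCount];
                      for (int i = 0; i < diskCount; i++)
                          perDiskFlushExecutors[i] = Executors.newFixedThreadPool(writersPerDisk);
                  }

                  // Quickly hand the flush off, then run one write task per disk in parallel and wait for all.
                  public Future<?> flush(Runnable[] perDiskWriteTasks)
                  {
                      return flushCoordinator.submit(() -> {
                          Future<?>[] parts = new Future<?>[perDiskWriteTasks.length];
                          for (int i = 0; i < perDiskWriteTasks.length; i++)
                              parts[i] = perDiskFlushExecutors[i].submit(perDiskWriteTasks[i]);
                          for (Future<?> part : parts)
                          {
                              try { part.get(); }
                              catch (Exception e) { throw new RuntimeException(e); }
                          }
                      });
                  }
              }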

          RangeAwareSSTableWriter should be called something else. For CASSANDRA-10540, we'll want to add a new one which splits the incoming stream by each token range, this one is splitting by disk assignments.

          We won't split on incoming streams based on token range in CASSANDRA-10540 - remote node will most likely already have sstables split based on its local ranges and those should match any ranges we own, so we can simply write it to disk, then the new sstable will get added to the correct compaction strategy (if it fits, otherwise it does a round in "L0")

          We should make sure that we have a splitter defined for the partition. We should make sure that the splitter is defined, otherwise we'll be creating a leveling where we are sending a huge number of sstables back to level 0 for overlap.

          What do you mean? Splitter for a partition?

          krummas Marcus Eriksson added a comment -

          fixed the rest of the comments and pushed to https://github.com/krummas/cassandra/commits/marcuse/6696-11

          renamed rebalance to redistribute, but if anyone has a better suggestion for this, let me know

          carlyeks Carl Yeksigian added a comment -

          Forgot to add myself as a watcher, so I didn't see the comments.

          Idea is that we have one thread per disk writing, but I guess the thread count should be DatabaseDescriptor.getFlushWriters() per disk and the flushExecutor thread count should be 1 - we want to quickly hand off to the single flushExecutor when flushing and then run the per disk writing in the perDiskFlushExecutor. Do you have any other suggestion on how to model this?

          I think the change to have flushWriters per disk makes sense, but we should set the default to 1 instead of # of disks; we should also update the comment in the cassandra.yaml.

          We won't split on incoming streams based on token range in CASSANDRA-10540 - remote node will most likely already have sstables split based on its local ranges and those should match any ranges we own, so we can simply write it to disk, then the new sstable will get added to the correct compaction strategy (if it fits, otherwise it does a round in "L0")

          Makes sense to me.

          What do you mean? Splitter for a partition?

          Reading that took me a while to figure out what I was trying to say; it was about sstableofflinerelevel. Looking over it again, I see that it is handling the different ranges correctly, so we can ignore that.

          Other than the slight changes around flush writers, the rest of it looks good to me.

          krummas Marcus Eriksson added a comment -

          Pushed the flush writers fix, and when re-running the tests I noticed a problem with ScrubTest and fixed that. Also fixed a dtest multi data directory failure I had missed.

          Test runs:
          http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-6696-11-testall
          http://cassci.datastax.com/view/Dev/view/krummas/job/6696_dtest

          yukim Yuki Morishita added a comment -

          Marcus Eriksson As you pointed out, the progress display will be messed up.
          Since the total bytes received for each boundary cannot be determined beforehand right now, displaying a constant name is the way to go. For that, keyspace and table names are enough imo.
          Of course, if we only have one disk, then we can do it the way we do now (showing the whole path).

          Other than that, streaming part seems good to me.

          krummas Marcus Eriksson added a comment -

          Pushed 2 new commits to the repo - the first one fixes a bug where streaming would never finish if we streamed to several sstables (CASSANDRA-10949).

          The second commit fixes the progress reporting by introducing a writer id that we key on instead of the file name; this means that the file name will change in netstats, but the progress will be correct.
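          A minimal sketch of that second commit's idea, assuming progress entries keyed by a stable writer UUID rather than the current file name; the ProgressInfo shown here is a simplified stand-in, not the real class.

          {code:java}
          import java.util.Map;
          import java.util.UUID;
          import java.util.concurrent.ConcurrentHashMap;

          // Illustrative sketch only: progress is keyed on the writer id, so the byte count
          // stays correct even when the writer moves on to a new sstable file.
          class StreamProgressTracker
          {
              static final class ProgressInfo
              {
                  final UUID writerId;   // stable key for the lifetime of the writer
                  String currentFile;    // may change as new sstables are opened
                  long bytesTransferred;

                  ProgressInfo(UUID writerId) { this.writerId = writerId; }
              }

              private final Map<UUID, ProgressInfo> progressByWriter = new ConcurrentHashMap<>();

              synchronized void update(UUID writerId, String fileName, long bytes)
              {
                  ProgressInfo info = progressByWriter.computeIfAbsent(writerId, ProgressInfo::new);
                  info.currentFile = fileName;      // the name shown in netstats may change...
                  info.bytesTransferred += bytes;   // ...but the accumulated progress stays correct
              }
          }
          {code}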

          http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-6696-11-testall
          http://cassci.datastax.com/view/Dev/view/krummas/job/6696_dtest

          Philip Thompson could you have a look at the dtests/ccm changes as well so we can merge them at the same time? https://github.com/krummas/cassandra-dtest/commits/marcuse/6696 and https://github.com/krummas/ccm/commits/multi-data-dirs

          carlyeks Carl Yeksigian added a comment -

          Changes LGTM. Do we have any idea why HintsCatalogTest.loadCompletenessAndOrderTest is failing?

          yukim Yuki Morishita added a comment -

          Carl Yeksigian that test should have been fixed in CASSANDRA-10950.

          carlyeks Carl Yeksigian added a comment -

          Yuki Morishita Ah, thanks; I'm still catching up on the changes.

          yukim Yuki Morishita added a comment - edited

          Marcus Eriksson I still prefer just returning a 'keyspace name/table name' pair in RangeAwareSSTableWriter#getFilename over adding a UUID to ProgressInfo. Even with the id, nodetool netstats will still show a constantly changing file name with inaccurate bytes.
          My suggested change is here.
          SSTableMultiWriter#getFilename is also used in the debug log when flushing of SSTable(s) completes, and because RangeAwareSSTableWriter can write several SSTables when flushing, I think displaying just the ks/table name there too is no more confusing than displaying only the last written file name. (edit: This looks like no problem here, my bad)
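          For reference, a tiny sketch of that alternative: have the range-aware writer report a constant keyspace/table identifier from getFilename instead of the last written file. The class below is a stand-in for illustration, not the actual RangeAwareSSTableWriter.

          {code:java}
          // Illustrative sketch only: a constant name keeps netstats/progress output stable
          // even though the writer switches between sstable files on different disks.
          class RangeAwareWriterNameSketch
          {
              private final String keyspace;
              private final String table;

              RangeAwareWriterNameSketch(String keyspace, String table)
              {
                  this.keyspace = keyspace;
                  this.table = table;
              }

              public String getFilename()
              {
                  return keyspace + "/" + table;
              }
          }
          {code}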

          krummas Marcus Eriksson added a comment -

          ok, merged, and new builds triggered:
          http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-6696-11-testall
          http://cassci.datastax.com/view/Dev/view/krummas/job/6696_dtest
          yukim Yuki Morishita added a comment -

          +1

          krummas Marcus Eriksson added a comment -

          committed, thanks guys!


            People

            • Assignee: krummas Marcus Eriksson
            • Reporter: kohlisankalp sankalp kohli
            • Reviewer: Carl Yeksigian
            • Tester: Ryan McGuire
            • Votes: 6
            • Watchers: 37
