[CASSANDRA-6134] Asynchronous batchlog replay - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Low
Resolution: Fixed
Fix Version/s: 2.1 rc1
Component/s: None
Labels: None

Description

As we discussed earlier in CASSANDRA-6079, this is the new BatchManager.

It stores batch records in:

CREATE TABLE batchlog (
  id_partition int,
  id timeuuid,
  data blob,
  PRIMARY KEY (id_partition, id)
) WITH COMPACT STORAGE AND
  CLUSTERING ORDER BY (id DESC)

where id_partition is the minute-since-epoch of the id uuid.
So when it scans for batches to replay, it scans within a single partition for a slice of ids from the last processed date until now minus the write timeout.
So neither a full batchlog CF scan nor a large number of random reads is made on a normal cycle.
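
To make the partitioning concrete, here is a minimal sketch of how id_partition could be derived from a timeuuid and which partitions a replay cycle would touch. The class and method names (BatchlogPartitionSketch, timeUuidAt, idPartition, partitionsToScan) are hypothetical and are not taken from the attached patches; only the minute-since-epoch rule and the "last processed until now minus write timeout" bound come from the description above.

```java
import java.util.UUID;
import java.util.stream.IntStream;

// Illustrative only: names are hypothetical, not from the attached patches.
final class BatchlogPartitionSketch {
    // Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns units.
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    // Builds a version 1 UUID for the given Unix time (clock sequence and node zeroed).
    static UUID timeUuidAt(long unixMillis) {
        long ts = unixMillis * 10_000L + UUID_EPOCH_OFFSET;
        long msb = (ts & 0xFFFFFFFFL) << 32        // time_low
                 | ((ts >>> 32) & 0xFFFFL) << 16   // time_mid
                 | 0x1000L                         // version 1
                 | ((ts >>> 48) & 0x0FFFL);        // time_hi
        return new UUID(msb, 0x8000000000000000L); // IETF variant bits
    }

    // Unix time in milliseconds encoded in a time-based UUID.
    static long unixMillis(UUID timeuuid) {
        return (timeuuid.timestamp() - UUID_EPOCH_OFFSET) / 10_000L;
    }

    // id_partition: minute-since-epoch of the id uuid.
    static int idPartition(UUID timeuuid) {
        return (int) (unixMillis(timeuuid) / 60_000L);
    }

    // Minute partitions covering [lastProcessed, now - writeTimeout].
    // On a normal cycle this is expected to be a single partition.
    static int[] partitionsToScan(long lastProcessedMillis, long nowMillis, long writeTimeoutMillis) {
        int first = (int) (lastProcessedMillis / 60_000L);
        int last = (int) ((nowMillis - writeTimeoutMillis) / 60_000L);
        return IntStream.rangeClosed(first, last).toArray();
    }
}
```

Because the partition key is derived from the id itself, a replay cycle only needs the last processed timestamp and the current time to know where to read.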

Other improvements:
1. It runs every 1/2 of the write timeout and replays all batches written within 0.9 * write timeout from now. This way we ensure that batched updates will be replayed by the moment the client times out from the coordinator.
2. It submits all mutations from a single batch in parallel (like StorageProxy does). The old implementation played them one by one, so a client could see half-applied batches in the CF for a long time (depending on the size of the batch).
3. It fixes a subtle race condition that caused incorrect hint TTL calculation.
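
Items 1 and 2 can be sketched roughly as follows. This is not the code from the attached patches: ReplaySketch and its methods are hypothetical, mutations are modelled as plain Runnable tasks, and java.util.concurrent.CompletableFuture stands in for whatever mechanism the real implementation uses to wait for responses.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Illustrative only: names are hypothetical, not from the attached patches.
final class ReplaySketch {
    // Item 1: the replay task runs every half of the write timeout.
    static long replayIntervalMillis(long writeTimeoutMillis) {
        return writeTimeoutMillis / 2;
    }

    // Item 1: the replay window is 0.9 * write timeout.
    static long replayWindowMillis(long writeTimeoutMillis) {
        return writeTimeoutMillis * 9 / 10;
    }

    // Item 2: submit every mutation of one batch at once, then wait for all of them,
    // instead of applying them one by one.
    static void replayBatch(List<Runnable> mutations, ExecutorService executor) {
        CompletableFuture<?>[] futures = mutations.stream()
                .map(m -> CompletableFuture.runAsync(m, executor))
                .toArray(CompletableFuture[]::new);
        CompletableFuture.allOf(futures).join();
    }
}
```

With a write timeout of 2000 ms, this sketch gives a replay interval of 1000 ms and a window of 1800 ms. Submitting the mutations together shortens the period during which a reader can observe a partially applied batch, since the slowest mutation no longer delays the start of the others.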

Attachments

BatchlogManager.txt: 02/Oct/13 15:29, 28 kB, Oleg Anastasyev
6134-async.txt: 07/Mar/14 10:05, 17 kB, Oleg Anastasyev
6134-cleanup.txt: 29/Mar/14 17:46, 15 kB, Aleksey Yeschenko

People

Assignee: Oleg Anastasyev
Reporter: Oleg Anastasyev
Authors: Oleg Anastasyev
Reviewers: Aleksey Yeschenko
Votes: 0
Watchers: 2

Dates

Created: 02/Oct/13 15:28
Updated: 16/Apr/19 09:32
Resolved: 14/May/14 22:18