[KUDU-120] Don't need to block on COMMIT before sending write responses - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: M4
Fix Version/s: None
Component/s: consensus, perf, tablet
Labels: None
Target Version/s: M4.5
Code Review: http://gerrit.ent.cloudera.com:8080/#/c/1831/

Description

Per extensive discussion on IRC this week, we determined the following:

Currently, we wait for the local peer to durably log a transaction's COMMIT record before we release locks and respond to the client.
However, from the consensus point of view this is unnecessary, by the intuition that any action that occurs on only a minority of nodes cannot be considered persistent.

Right now, we're relying on it for a separate reason: if we were to release the locks for the applied in-memory mutations before the COMMIT record is durable, then we could get the following interleaving:

1. log REPLICATE for a write
2. apply the write to memory
3. flush the memory to disk
4. log the COMMIT to disk

If we were to crash between steps 3 and 4, then the recovery code would try to replay the edit, not realizing that the edit was already made durable by virtue of the flush in step 3.
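To make the failure concrete, here is a toy model of that crash, not Kudu's actual recovery code. It assumes recovery re-applies every REPLICATE that has no matching COMMIT in the WAL, and that re-inserting a row that is already on disk counts as a conflict. The `recover` function, the tuple layout of WAL entries, and the row names are all hypothetical.

```python
def recover(wal, flushed_rows):
    """Toy recovery: replay each REPLICATE that has no COMMIT record in the WAL.

    wal is a list of (kind, txn_id, payload) tuples; flushed_rows is the state
    that a flush already made durable. Both shapes are assumptions of this sketch.
    """
    committed = {txn for kind, txn, _ in wal if kind == "COMMIT"}
    rows = dict(flushed_rows)
    conflicts = []
    for kind, txn, payload in wal:
        if kind != "REPLICATE" or txn in committed:
            continue
        key, value = payload
        if key in rows:
            # The edit is already durable via the flush, so the replay collides.
            conflicts.append(txn)
        else:
            rows[key] = value
    return rows, conflicts


# Crash between steps 3 and 4: REPLICATE logged, row flushed, COMMIT never logged.
wal_after_crash = [("REPLICATE", 1, ("row-a", "v1"))]
rows, conflicts = recover(wal_after_crash, {"row-a": "v1"})
print(conflicts)  # [1]
```

If the COMMIT record had reached the WAL before the flush, transaction 1 would be in `committed` and recovery would skip it.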

The solution is to add a step to the flush/compact code that acts as a "soft barrier" of sorts: wait to perform the flush until all of the transactions pertaining to data in that memory region have been COMMITted in the WAL.
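A minimal sketch of such a barrier, for illustration only. It assumes one barrier per memory region, and that the flush path has already stopped routing new writes to that region before it waits. The class and method names (`CommitBarrier`, `applied`, `commit_durable`, `wait_until_committed`) are hypothetical and are not taken from the Kudu code base.

```python
import threading


class CommitBarrier:
    """Tracks transactions applied to one memory region whose COMMIT is not yet durable."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = set()

    def applied(self, txn_id):
        # Called once the write has been applied in memory and its locks released.
        with self._cond:
            self._pending.add(txn_id)

    def commit_durable(self, txn_id):
        # Called from the WAL callback once the COMMIT record is durable.
        with self._cond:
            self._pending.discard(txn_id)
            if not self._pending:
                self._cond.notify_all()

    def wait_until_committed(self, timeout=None):
        # Called by flush/compaction before writing the region to disk.
        # Returns True once nothing is pending, False if the timeout expires first.
        with self._cond:
            return self._cond.wait_for(lambda: not self._pending, timeout)


barrier = CommitBarrier()
barrier.applied(1)             # write applied, client already answered
barrier.commit_durable(1)      # COMMIT reaches the WAL later
if barrier.wait_until_committed(timeout=1.0):
    print("safe to flush")
```

The write path no longer blocks on the COMMIT record; only the flush does, and only for transactions that touched the region being flushed.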

People

Assignee: David Alves

Reporter: Todd Lipcon

Votes: 0

Watchers: 3

Dates

Created: 14/Feb/14 16:06

Updated: 26/Feb/16 11:34

Resolved: 25/Mar/15 13:43