[ASTERIXDB-2666] One-Phase Log Replay Approach - ASF JIRA

Details

Type: Wish
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: STO - Storage, TX - Transactions
Labels: None

Description

AsterixDB currently uses a classical two-phase log replay approach during recovery: it first identifies committed writes and then applies these committed writes to the LSM-trees. This is a standard approach for general-purpose transaction processing systems, but for AsterixDB we can design something better.
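For comparison, the two-phase scheme can be sketched roughly as below. This is only an illustration, not AsterixDB's actual recovery code: the `LogRecord` shape, the record kinds, the dictionary standing in for an LSM-tree, and the function name `two_phase_replay` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical, simplified log record (not the real AsterixDB format)."""
    lsn: int
    kind: str            # "UPDATE" or "ENTITY_COMMIT"
    job_id: int
    key: Any = None
    value: Any = None


def two_phase_replay(log: List[LogRecord], store: Dict[Any, Any]) -> int:
    """Replay `log` into `store` with two passes; return the number of
    commit entries that had to be held in memory."""
    # Pass 1: remember the LSN of the last entity commit per (job, key).
    last_commit: Dict[Tuple[int, Any], int] = {}
    for rec in log:
        if rec.kind == "ENTITY_COMMIT":
            last_commit[(rec.job_id, rec.key)] = rec.lsn

    # Pass 2: redo only updates that are followed by a matching commit.
    for rec in log:
        if rec.kind == "UPDATE":
            commit_lsn = last_commit.get((rec.job_id, rec.key))
            if commit_lsn is not None and rec.lsn < commit_lsn:
                store[rec.key] = rec.value
    return len(last_commit)


log = [
    LogRecord(1, "UPDATE", 1, "a", 10),
    LogRecord(2, "ENTITY_COMMIT", 1, "a"),
    LogRecord(3, "UPDATE", 1, "a", 99),   # never committed
    LogRecord(4, "UPDATE", 2, "b", 20),   # never committed
]
store: Dict[Any, Any] = {}
held = two_phase_replay(log, store)
```

In this sketch the commit table grows with the number of committed entities in the replayed log range, and the log is read twice.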

AsterixDB uses a record-level transaction model where each write is committed as soon as possible via an "entity commit". To exploit this property, we can design a one-phase log replay approach as follows:

1. Start from the log head based on the low watermark LSN.
2. Whenever we see an update log record, store that log record in memory (for each job).
3. Whenever we see an entity commit or abort record, redo the corresponding update log record immediately and remove it from memory.

The key property here is that the window between an update log record and its commit log record is very short, because we commit on a frame basis. This speeds up recovery by reading the log in a single pass and avoids storing all entity commits in memory. During recovery we only need a small amount of memory, proportional to the window between updates and commits.
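A minimal sketch of the one-phase replay described above, under stated assumptions. The `LogRecord` shape, the record kinds, the dictionary standing in for an LSM-tree, and the function name `one_phase_replay` are hypothetical. The sketch also assumes that an abort record applies to a whole job and that its pending updates are dropped rather than redone; the actual abort handling would need to follow AsterixDB's rollback semantics.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical, simplified log record (not the real AsterixDB format)."""
    lsn: int
    kind: str            # "UPDATE", "ENTITY_COMMIT", or "ABORT"
    job_id: int
    key: Any = None
    value: Any = None


def one_phase_replay(log: Iterable[LogRecord],
                     low_watermark_lsn: int,
                     store: Dict[Any, Any]) -> int:
    """Replay `log` into `store` in a single pass; return the peak number
    of update records buffered in memory at any point."""
    # job_id -> key -> updates seen but not yet committed
    pending: Dict[int, Dict[Any, List[LogRecord]]] = defaultdict(dict)
    buffered = 0
    peak = 0
    for rec in log:
        if rec.lsn < low_watermark_lsn:
            continue  # already durable before the low watermark
        if rec.kind == "UPDATE":
            pending[rec.job_id].setdefault(rec.key, []).append(rec)
            buffered += 1
            peak = max(peak, buffered)
        elif rec.kind == "ENTITY_COMMIT":
            # Redo the buffered updates for this entity, then forget them.
            updates = pending[rec.job_id].pop(rec.key, [])
            for upd in updates:
                store[upd.key] = upd.value
            buffered -= len(updates)
        elif rec.kind == "ABORT":
            # Assumption: drop everything the aborted job still has pending.
            dropped = pending.pop(rec.job_id, {})
            buffered -= sum(len(v) for v in dropped.values())
    return peak


log = [
    LogRecord(1, "UPDATE", 1, "a", "old"),   # below the low watermark
    LogRecord(2, "ENTITY_COMMIT", 1, "a"),
    LogRecord(3, "UPDATE", 1, "a", 10),
    LogRecord(4, "UPDATE", 2, "b", 20),
    LogRecord(5, "ENTITY_COMMIT", 1, "a"),
    LogRecord(6, "UPDATE", 2, "c", 30),
    LogRecord(7, "ABORT", 2),
    LogRecord(8, "UPDATE", 3, "d", 40),      # never committed
]
store: Dict[Any, Any] = {}
peak = one_phase_replay(log, low_watermark_lsn=3, store=store)
```

Here the buffer never holds more than the updates that are still waiting for their commit, which is the memory bound the proposal relies on.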

People

Assignee: Unassigned

Reporter: Chen Luo

Votes: 0

Watchers: 1

Dates

Created: 31/Oct/19 17:10

Updated: 31/Oct/19 17:10