[KUDU-635] Implement clean shutdown - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Trivial
Resolution: Unresolved
Affects Version/s: M5
Fix Version/s: None
Component/s: recovery
Labels:
- roadmap-candidate

Target Version/s: Backlog

Description

Today, a Kudu node's "shutdown" routine amounts to exiting abruptly upon receipt of a signal, be it SIGINT, SIGTERM, or (obviously) SIGKILL. Any in-memory state (such as an MRS or DRS) is lost, and on startup the WAL must be replayed as part of bootstrap.

It's not hard to conceive of a cleaner shutdown routine. It'd probably be issued via RPC, and it would perform the following steps:

1. Quiesce the server so that future RPCs are dropped.
2. Abdicate quorum leadership.
3. Flush every MRS/DRS.
4. GC every WAL.
5. Exit gracefully (i.e. run through the TS/Master destructor).
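
The ordering of the steps above matters: the WAL is what bootstrap replays to rebuild in-memory state, so it can only be discarded once that state has been flushed. The following is a minimal sketch of that sequencing in Python, not Kudu code. `ToyServer`, its methods, and the store and segment names are all hypothetical and exist only to illustrate the order of operations.

```python
from typing import List


class ToyServer:
    """Hypothetical stand-in for a tablet server or master; not a Kudu API."""

    def __init__(self) -> None:
        self.accepting_rpcs = True
        self.is_leader = True
        # In-memory state that has not been written to disk yet.
        self.unflushed_stores: List[str] = ["mrs-0", "drs-0"]
        self.wal_segments: List[str] = ["wal-1", "wal-2"]
        self.log: List[str] = []

    def quiesce(self) -> None:
        # Step 1: stop accepting new RPCs.
        self.accepting_rpcs = False
        self.log.append("quiesce")

    def abdicate_leadership(self) -> None:
        # Step 2: give up quorum leadership.
        self.is_leader = False
        self.log.append("abdicate")

    def flush_all(self) -> None:
        # Step 3: persist every in-memory store.
        self.unflushed_stores.clear()
        self.log.append("flush")

    def gc_wals(self) -> None:
        # Step 4: in this toy model, refuse to drop WAL segments while
        # unflushed state still depends on them.
        if self.unflushed_stores:
            raise RuntimeError("cannot GC WALs with unflushed state")
        self.wal_segments.clear()
        self.log.append("gc_wal")

    def destroy(self) -> None:
        # Step 5: stand-in for running through the destructor.
        self.log.append("exit")


def clean_shutdown(server: ToyServer) -> List[str]:
    """Run the five steps in order and return the recorded sequence."""
    for step in (
        server.quiesce,
        server.abdicate_leadership,
        server.flush_all,
        server.gc_wals,
        server.destroy,
    ):
        step()
    return server.log
```

Calling `clean_shutdown(ToyServer())` leaves the toy server with no unflushed stores and no WAL segments, which is the state that would let bootstrap skip replay on the next startup.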

Kudu is meant to recover in the event of a crash, so why bother with a clean shutdown? Why not make every shutdown an "abrupt" one? A clean shutdown would take more time to run, but it would also guarantee faster startup because there would be less work to do during bootstrap. The expectation is that the work done at shutdown takes less time than the work it saves at startup, i.e. time("work at shutdown") < time("work at startup"), which would also help make Kudu rolling restarts more efficient. A similar tack was recently taken in HDFS for the same reason.

The easy part (step #5 from the above list) was recently implemented here.

Issue Links

is related to

KUDU-2054 Rolling Restart and Upgrade

Resolved

People

Assignee: Unassigned

Reporter: Adar Dembo

Votes: 0

Watchers: 4

Dates

Created: 04/Mar/15 14:45

Updated: 06/Aug/19 23:52