[KUDU-2359] tserver should allow starting with a small number of missing data dirs

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.8.0
Component/s: fs, tserver
Labels: None
Target Version/s: 1.8.0

Description

When a disk fails, its mount point often does not come back up after the server is restarted. Kudu currently responds by refusing to start, failing with an error like:
F0314 18:23:39.353916 112051 tablet_server_main.cc:80] Check failed: _s.ok() Bad status: Already present: FS layout already exists; not overwriting existing layout. See https://kudu.apache.org/releases/1.8.0-SNAPSHOT/docs/troubleshooting.html: unable to create file system roots: FSManager roots already exist: /data/1/kudu,/data/2/kudu,/data/3/kudu,/data/5/kudu,/data/6/kudu,/data/7/kudu,/data/8/kudu,/data/1/kudu-wal

This defeats some of the advantages of the "allow single disk failure" work. One could use the update_data_dirs tool to remove the missing disk, but that also requires persistently changing the daemon's configuration, which is hard to do when a configuration management system keeps the configuration consistent.
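
The requested behavior can be illustrated with a minimal sketch. This is not Kudu's implementation; the function name `partition_data_dirs` and the `max_missing` threshold are hypothetical names chosen for illustration. It assumes a data dir counts as missing when its path does not exist as a directory, as with the absent /data/4/kudu root implied by the error above.

```python
import os
import tempfile


def partition_data_dirs(configured_dirs, max_missing=1):
    """Split configured data dirs into (present, missing).

    Hypothetical helper: raises RuntimeError if no dir is usable or if
    more than `max_missing` dirs are absent, otherwise lets startup proceed.
    """
    present = [d for d in configured_dirs if os.path.isdir(d)]
    missing = [d for d in configured_dirs if not os.path.isdir(d)]
    if not present:
        raise RuntimeError("no data directories are available")
    if len(missing) > max_missing:
        raise RuntimeError(
            "too many missing data directories: " + ",".join(missing)
        )
    return present, missing


if __name__ == "__main__":
    # Simulate three configured roots, one of which failed to mount.
    with tempfile.TemporaryDirectory() as root:
        dirs = [os.path.join(root, name) for name in ("1", "2", "4")]
        for d in dirs[:2]:
            os.mkdir(d)
        present, missing = partition_data_dirs(dirs, max_missing=1)
        print("starting with:", present)
        print("treating as failed:", missing)
```

With a threshold like this, the configured list of data dirs can stay unchanged across restarts, so the daemon's persistent configuration would not need to be edited for each failed disk.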

Issue Links

is related to: KUDU-2372 Don't let kudu start up if any disks are mounted read-only (Open)

People

Assignee: Andrew Wong

Reporter: Todd Lipcon

Votes: 0

Watchers: 3

Dates

Created: 20/Mar/18 05:53

Updated: 23/May/18 23:21

Resolved: 23/May/18 23:21