[KAFKA-4124] Handle disk failures gracefully - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

Currently, when a disk fails, the broker goes down with it. Replacing the broker then forces a large amount of data to be reshuffled over the network. The broker should be made resilient to disk failures.

The broker can detect a disk failure, mark the disk as bad, and then re-replicate the under-replicated data onto the other available disks in the node. If the bad disk is replaced with a new one, the broker can rebalance the data across all of its disks. The broker can also tolerate up to n disk failures.
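
The behaviour proposed above can be sketched roughly as follows. This is only an illustrative model, not Kafka code: `DiskPool`, `mark_failed`, `replace`, `TooManyDiskFailures` and the "least-loaded disk" placement rule are all hypothetical names and assumptions made for this sketch, and the sketch only tracks which partition lives on which disk (it does not copy any data).

```python
from __future__ import annotations

from dataclasses import dataclass, field


class TooManyDiskFailures(RuntimeError):
    """Raised when a failure would exceed the tolerated number of bad disks."""


@dataclass
class DiskPool:
    # Hypothetical model: disk path -> names of the partitions stored on it.
    disks: dict[str, set[str]]
    # The "n" from the description: how many failed disks are tolerated.
    max_failures: int
    failed: set[str] = field(default_factory=set)

    def healthy(self) -> list[str]:
        return [d for d in self.disks if d not in self.failed]

    def _load(self, disk: str) -> tuple[int, str]:
        # Sort key: partition count first, disk name as a deterministic tie-break.
        return (len(self.disks[disk]), disk)

    def mark_failed(self, disk: str) -> dict[str, str]:
        """Mark `disk` bad and reassign its partitions to healthy disks.

        Returns a mapping of partition -> disk it was reassigned to.
        """
        if disk in self.failed:
            return {}
        survivors = [d for d in self.healthy() if d != disk]
        if len(self.failed) + 1 > self.max_failures or not survivors:
            raise TooManyDiskFailures(f"cannot tolerate failure of {disk}")
        self.failed.add(disk)
        orphaned = sorted(self.disks[disk])
        self.disks[disk] = set()
        moves: dict[str, str] = {}
        for partition in orphaned:
            # Assumed placement rule: always pick the least-loaded healthy disk.
            target = min(survivors, key=self._load)
            self.disks[target].add(partition)
            moves[partition] = target
        return moves

    def replace(self, disk: str) -> None:
        """Bring a replaced (empty) disk back and even out the partition counts."""
        self.failed.discard(disk)
        self.disks[disk] = set()
        while True:
            healthy = self.healthy()
            most = max(healthy, key=self._load)
            least = min(healthy, key=self._load)
            if len(self.disks[most]) - len(self.disks[least]) <= 1:
                break
            partition = min(self.disks[most])  # deterministic choice
            self.disks[most].remove(partition)
            self.disks[least].add(partition)
```

For example, with three disks and `max_failures=1`, calling `mark_failed` on one disk spreads its partitions over the other two, a second failure raises `TooManyDiskFailures`, and `replace` on the first disk rebalances until the partition counts differ by at most one.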

Issue Links

is duplicated by: KAFKA-4763 Handle disk failure for JBOD (KIP-112) (Resolved)

People

Assignee: Unassigned

Reporter: Gokul

Votes: 1

Watchers: 3

Dates

Created: 05/Sep/16 11:31

Updated: 12/Apr/18 14:29

Resolved: 12/Apr/18 14:29