Hadoop HDFS / HDFS-1539

Prevent data loss when a cluster suffers a power loss

    Details

    • Hadoop Flags:
      Reviewed

      Description

      We have seen an instance where an external outage caused many datanodes to reboot at around the same time. This resulted in many corrupted blocks. These were recently written blocks; the current implementation of the HDFS datanode does not sync the data of a block file when the block is closed.

      1. Have a cluster-wide config setting that causes the datanode to sync a block file when a block is finalized.
      2. Introduce a new parameter to FileSystem.create() to trigger the new behaviour, i.e. cause the datanode to sync a block file when it is finalized.
      3. Implement FSDataOutputStream.hsync() to cause all data written to the specified file to be written to stable storage.
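The sync in the options above ultimately comes down to flushing user-space buffers and forcing the OS to push the file (and its metadata) to stable storage before the block is reported finalized. A minimal JDK-only sketch of that sequence (file names here are hypothetical; the actual patch does the equivalent inside the datanode):

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SyncOnCloseSketch {
    // Write a block file and force it to stable storage before "finalizing",
    // mirroring what a sync-on-close setting asks the datanode to do.
    static void writeAndSync(Path blockFile, byte[] data, boolean syncOnClose)
            throws IOException {
        try (FileOutputStream out = new FileOutputStream(blockFile.toFile())) {
            out.write(data);
            out.flush();                      // drain user-space buffers
            if (syncOnClose) {
                // force(true) also syncs file metadata, like fsync(2)
                out.getChannel().force(true);
            }
        } // close() alone does NOT guarantee the data survives a power loss
    }

    public static void main(String[] args) throws IOException {
        Path blk = Files.createTempFile("blk_", ".data");
        writeAndSync(blk, "hello".getBytes(StandardCharsets.UTF_8), true);
        System.out.println(Files.size(blk)); // prints 5
        Files.delete(blk);
    }
}
```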

      Attachments

      1. syncOnClose1.txt (6 kB) - dhruba borthakur
      2. syncOnClose2_b-1.txt (6 kB) - Tsz Wo Nicholas Sze
      3. syncOnClose2.txt (6 kB) - dhruba borthakur

        Issue Links

          Activity

          dhruba borthakur added a comment -

          We have seen this problem on a cluster that is purely used for archival purposes. I propose that we implement Option 1 listed above.

          Allen Wittenauer added a comment -

          Is there a reason why the datanode just shouldn't sync anyway? [i.e., is it really worth it to make this configurable?]

          Todd Lipcon added a comment -

          @Allen: some file systems, if you sync() one file, will end up syncing all files, essentially. So it could be a moderately big performance hit, though it would be worth benchmarking terasort with/without - it should be fairly obvious if it's a killer.

          dhruba borthakur added a comment -

          Here is a patch that makes the datanode flush and sync all data and metadata of a block file to disk when the block is closed. This occurs only if dfs.datanode.synconclose is set to true. The default value of dfs.datanode.synconclose is false.

          If the admin does not set any value for the new config parameter, then the behaviour of the datanode stays the same as it was prior to this patch.
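For reference, opting in would then be a one-property change in hdfs-site.xml (property name taken from the comment above; the default shown matches the described behaviour):

```xml
<!-- hdfs-site.xml: ask datanodes to flush and sync block data and metadata
     when a block is finalized. Defaults to false (pre-patch behaviour). -->
<property>
  <name>dfs.datanode.synconclose</name>
  <value>true</value>
</property>
```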

          dhruba borthakur added a comment -

          @Allen: Thanks for your comments. I have kept the default behaviour as it is now, especially because I do not want any existing installations to see bad performance behaviour when they run with this patch. (On some customer sites, it is possible that they have enough redundant power supplies that they never have to configure this patch to be turned on.)

          Todd Lipcon added a comment -

          dhruba: do you plan to run this on your warehouse cluster or just scribe tiers? If so it would be very interesting to find out whether it affects throughput. If there is no noticeable hit I would argue to make it the default.

          dhruba borthakur added a comment -

          I could make it the default, but I would like to hear the opinion of many people who are running Hadoop clusters. Also, performance numbers could vary a lot based on the operating system and file system (CentOS, Red Hat, Windows, ext4, xfs), so it would be difficult to get it right based solely on performance. On the other hand, if the entire community thinks that it is better to have a default that prevents data loss at all costs, then this could be the default. If the debate on either side is fierce, then I would like to get this in first and then open another JIRA to debate the default settings.

          We are definitely going to deploy this first on our "archival" cluster. This is a cluster that is used purely to back up and restore data from MySQL databases.

          Todd Lipcon added a comment -

          Yep, I certainly didn't intend to block this JIRA. What you've done here is definitely prudent, and we can debate/benchmark turning it on by default in another JIRA.

          M. C. Srivas added a comment -

          Dhruba, so if there's a file with 20 blocks on 20 different servers, with 3 replicas each, we might potentially end up sync'ing 41 servers (= 1 primary + 20*2 replicas) when closing the file, correct?

          dhruba borthakur added a comment -

          If there is a file with 20 blocks and each block has three replicas, then there will be a total of 60 fflush calls; this does not depend on the number of servers.

          Hairong Kuang added a comment -

          BlockReceiver#cout should set to be streams.checksumOut, right?

          dhruba borthakur added a comment -

          The first change in BlockReceiver.java is

          
          -        this.checksumOut = new DataOutputStream(new BufferedOutputStream(
          -                                                  streams.checksumOut,
          -                                                  SMALL_BUFFER_SIZE));
          +        this.cout = new BufferedOutputStream(streams.checksumOut,
          +                                                  SMALL_BUFFER_SIZE);
          +        this.checksumOut = new DataOutputStream(this.cout);

          is this what you meant?

          Hairong Kuang added a comment -

          Yes, should
          + this.cout = new BufferedOutputStream(streams.checksumOut,
          + SMALL_BUFFER_SIZE);
          be this.cout = streams.checksumOut?
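Read together, the suggestion is to keep cout as the raw, unbuffered checksum stream (so sync-on-close can reach its file descriptor) and leave the buffering inside the DataOutputStream wrapper only. A self-contained sketch of that stream layering, using a ByteArrayOutputStream as a stand-in for the datanode's real FileOutputStream (the names SMALL_BUFFER_SIZE, cout, and checksumOut mirror the snippet above; this is not the committed patch):

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class CoutSketch {
    public static void main(String[] args) throws IOException {
        final int SMALL_BUFFER_SIZE = 512;
        // stand-in for streams.checksumOut (a FileOutputStream in the datanode)
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        // keep the unbuffered stream so a later sync can reach the device
        OutputStream cout = raw;
        // writers still go through a buffer for small checksum writes
        DataOutputStream checksumOut =
            new DataOutputStream(new BufferedOutputStream(cout, SMALL_BUFFER_SIZE));
        checksumOut.writeInt(42);
        checksumOut.flush(); // drain the buffer into cout before any sync/close
        System.out.println(raw.size()); // prints 4
    }
}
```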

          dhruba borthakur added a comment -

          Incorporated Hairong's comments.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12467019/syncOnClose2.txt
          against trunk revision 1053203.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.namenode.TestStorageRestore
          org.apache.hadoop.hdfs.TestFileConcurrentReader

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/49//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/49//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/49//console

          This message is automatically generated.

          Hairong Kuang added a comment -

          +1. The patch looks good.

          A minor comment is that I do not think the unit test is of much use, because the bug occurs when a machine is powered off and that is hard to simulate.

          dhruba borthakur added a comment -

          I just committed this.

          stack added a comment -

          Should we pull this into 1.0.3? Or 1.1.0?

          Tsz Wo Nicholas Sze added a comment -

          Sure, let's backport this to branch-1.

          Tsz Wo Nicholas Sze added a comment -

          syncOnClose2_b-1.txt: for branch-1.

          Suresh Srinivas added a comment -

          Nicholas, I compared the backported patch with the original. It looks good. +1 for the patch.

          We should get this into 1.1.1.

          Tsz Wo Nicholas Sze added a comment -

          Interestingly, TestFileCreation fails in branch-1 (with and without the patch) but not branch-1.1. I will file a JIRA for it.

          Tsz Wo Nicholas Sze added a comment -

          I have committed this to branch-1 and branch-1.1.

          Matt Foley added a comment -

          Closed upon release of 1.1.1.

          Dave Latham added a comment -

          Does anyone have any performance numbers for enabling this? Or, does anyone just have some experience running this on significant workloads in production? (Especially HBase?)


            People

            • Assignee: dhruba borthakur
            • Reporter: dhruba borthakur
            • Votes: 0
            • Watchers: 22
