[HDFS-909] Race condition between rollEditLog or rollFSImage and FSEditLog.write operations corrupts edits log

Details

Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: 0.20.1, 0.20.2, 0.21.0, 0.22.0
Fix Version/s: 0.20.3, 0.21.0
Component/s: namenode
Labels: None
Environment: CentOS
Hadoop Flags: Reviewed

Description

Closing the edits log file can race with a concurrent write to the edits log. When both threads go through setReadyToFlush, the OP_INVALID end-of-file marker is first overwritten and then removed twice, so a valid byte is lost from the edits log.

Example:

FSNamesystem.rollEditLog() -> FSEditLog.divertFileStreams() -> FSEditLog.closeStream() -> EditLogOutputStream.setReadyToFlush()
FSNamesystem.rollEditLog() -> FSEditLog.divertFileStreams() -> FSEditLog.closeStream() -> EditLogOutputStream.flush() -> EditLogFileOutputStream.flushAndSync()
OR
FSNamesystem.rollFSImage() -> FSImage.rollFSImage() -> FSEditLog.purgeEditLog() -> FSEditLog.revertFileStreams() -> FSEditLog.closeStream() -> EditLogOutputStream.setReadyToFlush()
FSNamesystem.rollFSImage() -> FSImage.rollFSImage() -> FSEditLog.purgeEditLog() -> FSEditLog.revertFileStreams() -> FSEditLog.closeStream() -> EditLogOutputStream.flush() -> EditLogFileOutputStream.flushAndSync()

VERSUS

FSNamesystem.completeFile() -> FSEditLog.logSync() -> EditLogOutputStream.setReadyToFlush()
FSNamesystem.completeFile() -> FSEditLog.logSync() -> EditLogOutputStream.flush() -> EditLogFileOutputStream.flushAndSync()
OR
Any FSEditLog.write

Access to the edits flush operations is synchronized only at the level of the FSEditLog.logSync() method. At the lower level, access to EditLogOutputStream setReadyToFlush(), flush() and flushAndSync() is NOT synchronized, so these methods can be called from concurrent threads, as in the example above.
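To illustrate the locking gap, here is a minimal sketch of a toy edit log in which every caller, whether it is logging an edit or closing the stream, holds the same lock across both flush steps. This is not Hadoop code and not necessarily what the attached patches do: GuardedEditLog, logEdit, runDemo and the in-memory "file" are hypothetical names invented for the example, and the two private methods only loosely mirror setReadyToFlush() and flushAndSync().

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical toy model; not the Hadoop FSEditLog or EditLogFileOutputStream.
class GuardedEditLog {
    static final byte OP_INVALID = (byte) -1; // stand-in end-of-file marker

    private ByteArrayOutputStream bufCurrent = new ByteArrayOutputStream();
    private ByteArrayOutputStream bufReady = new ByteArrayOutputStream();
    private final byte[] file = new byte[1 << 16]; // stand-in for the on-disk edits file
    private int pos = 0;                           // current write position in the file
    private final Object lock = new Object();

    // Low-level step 1: append the marker and swap the two buffers (unsynchronized).
    private void setReadyToFlush() {
        bufCurrent.write(OP_INVALID);
        ByteArrayOutputStream tmp = bufReady;
        bufReady = bufCurrent;
        bufCurrent = tmp;
    }

    // Low-level step 2: write the ready buffer, then step back over the marker (unsynchronized).
    private void flushAndSync() {
        byte[] data = bufReady.toByteArray();
        System.arraycopy(data, 0, file, pos, data.length);
        pos += data.length;
        bufReady.reset();
        pos -= 1;
    }

    // Write path: both flush steps run under one lock.
    void logEdit(byte op) {
        synchronized (lock) {
            bufCurrent.write(op);
            setReadyToFlush();
            flushAndSync();
        }
    }

    // Close/roll path: takes the SAME lock, so it cannot interleave with logEdit.
    void closeStream() {
        synchronized (lock) {
            setReadyToFlush();
            flushAndSync();
        }
    }

    byte[] durableEdits() {
        synchronized (lock) { return Arrays.copyOf(file, pos); }
    }

    byte trailingMarker() {
        synchronized (lock) { return file[pos]; }
    }

    // Runs several writer threads against one thread that keeps closing the stream.
    static GuardedEditLog runDemo(int writers, int editsPerWriter) {
        GuardedEditLog log = new GuardedEditLog();
        Thread[] threads = new Thread[writers + 1];
        for (int w = 0; w < writers; w++) {
            final byte op = (byte) (w + 1);
            threads[w] = new Thread(() -> {
                for (int i = 0; i < editsPerWriter; i++) log.logEdit(op);
            });
        }
        threads[writers] = new Thread(() -> {
            for (int i = 0; i < editsPerWriter; i++) log.closeStream();
        });
        for (Thread t : threads) t.start();
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        GuardedEditLog log = runDemo(4, 500);
        System.out.println("durable edits: " + log.durableEdits().length);
    }
}
```

Because the marker write, the buffer swap, the file write and the step back all happen inside one critical section, no caller can observe or modify the buffers between the two steps. A coarse lock like this trades throughput for simplicity; it is shown only to make the missing mutual exclusion concrete.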

So if a rollEditLog or rollFSImage happens at the same time as a write operation, the two threads can race in EditLogFileOutputStream.setReadyToFlush(). The race overwrites the last byte (normally FSEditLog.OP_INVALID, the "end-of-file marker"), and the marker is then removed twice, once by each thread, in flushAndSync(). As a result a valid byte goes missing from the edits log, which leads to a silent SecondaryNameNode failure and a full HDFS failure upon cluster restart.
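The double removal can be replayed deterministically in a single thread by executing the two threads' steps in the bad order. The sketch below is a simplified model, not the Hadoop implementation: DoubleFlushReplay, its in-memory "file" and the byte values used as edits are all assumptions made for illustration, and the exact layout of a corrupted file on a real NameNode may differ.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical toy model of a double-buffered edits stream; not Hadoop code.
class DoubleFlushReplay {
    static final byte OP_INVALID = (byte) -1; // stand-in end-of-file marker

    ByteArrayOutputStream bufCurrent = new ByteArrayOutputStream();
    ByteArrayOutputStream bufReady = new ByteArrayOutputStream();
    final byte[] file = new byte[64]; // stand-in for the on-disk edits file
    int pos = 0;                      // current write position in the file

    void write(int b) {
        bufCurrent.write(b);
    }

    // Append the marker to the current buffer, then swap the buffers.
    void setReadyToFlush() {
        write(OP_INVALID);
        ByteArrayOutputStream tmp = bufReady;
        bufReady = bufCurrent;
        bufCurrent = tmp;
    }

    // Write the ready buffer to the file, then step back one byte,
    // on the assumption that the last byte written was the marker.
    void flushAndSync() {
        byte[] data = bufReady.toByteArray();
        System.arraycopy(data, 0, file, pos, data.length);
        pos += data.length;
        bufReady.reset();
        pos -= 1;
    }

    // Replays: A.setReadyToFlush, B.setReadyToFlush, A.flushAndSync, B.flushAndSync.
    static DoubleFlushReplay replay() {
        DoubleFlushReplay s = new DoubleFlushReplay();

        // Two edits (1 and 2) are flushed correctly: file = [1, 2, marker], pos = 2.
        s.write(1);
        s.write(2);
        s.setReadyToFlush();
        s.flushAndSync();

        s.write(3);          // a new edit is pending in bufCurrent
        s.setReadyToFlush(); // thread A (e.g. the roll path) swaps the buffers
        s.setReadyToFlush(); // thread B (e.g. logSync) swaps them back again
        s.flushAndSync();    // thread A writes only a marker and steps back once
        s.flushAndSync();    // thread B writes nothing but steps back a second time
        return s;
    }

    public static void main(String[] args) {
        DoubleFlushReplay s = replay();
        System.out.println("write position: " + s.pos);
        System.out.println("byte at write position: " + s.file[s.pos]);
    }
}
```

In this model the second flushAndSync() finds an empty ready buffer yet still steps the position back, so the write position ends up on a valid edit byte rather than on the marker, and the next flush overwrites that edit.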

We got to this point after investigating a corrupted edits file that left HDFS unable to start, failing with the following error:

namenode.log

java.io.IOException: Incorrect data format. logVersion is -20 but writables.length is 768. 
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadEditRecords(FSEditLog.java:450

EDIT: moved the logs to a comment to make this readable

Attachments

hdfs-909-unittest.txt, 29/Jan/10 01:00, 6 kB, Todd Lipcon
hdfs-909.txt, 29/Jan/10 01:54, 11 kB, Todd Lipcon
hdfs-909.txt, 02/Feb/10 04:00, 12 kB, Todd Lipcon
hdfs-909.txt, 02/Feb/10 22:33, 12 kB, Todd Lipcon
hdfs-909.txt, 16/Feb/10 02:01, 25 kB, Todd Lipcon
hdfs-909.txt, 06/Apr/10 21:49, 25 kB, Todd Lipcon
ASF.LICENSE.NOT.GRANTED--hdfs-909.txt, 13/Apr/10 23:59, 25 kB, Todd Lipcon
ASF.LICENSE.NOT.GRANTED--hdfs-909-branch-0.20.txt, 17/Apr/10 01:31, 25 kB, Todd Lipcon
hdfs-909-branch-0.21.txt, 19/Apr/10 23:24, 24 kB, Konstantin Shvachko
hdfs-909-ammendation.txt, 20/Apr/10 21:28, 7 kB, Todd Lipcon
hdfs-909-branch-0.20.txt, 20/Apr/10 21:54, 25 kB, Todd Lipcon
hdfs-909-unified.txt, 21/Apr/10 00:12, 26 kB, Todd Lipcon
hdfs-909-branch-0.20.txt, 21/Apr/10 00:12, 25 kB, Todd Lipcon
hdfs-909-branch-0.21.txt, 21/Apr/10 03:06, 25 kB, Konstantin Shvachko

Issue Links

is blocked by
HADOOP-6554 DelegationTokenSecretManager lifecycle is inconsistent (Resolved)

is related to
HDFS-980 Convert FSNamesystem lock to ReadWriteLock (Resolved)

relates to
HDFS-955 FSImage.saveFSImage can lose edits (Resolved)
HDFS-956 Improper synchronization in some FSNamesystem methods (Resolved)
HDFS-988 saveNamespace race can corrupt the edits log (Closed)

People

Assignee: Todd Lipcon
Reporter: Cosmin Lehene
Votes: 0
Watchers: 14

Dates

Created: 20/Jan/10 18:04
Updated: 24/Aug/10 20:51
Resolved: 21/Apr/10 03:09