Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Won't Fix
Affects Version/s: 0.95.2
Fix Version/s: None
Component/s: None
Environment: all
Description
This comes from an HDFS bug that has been fixed in some HDFS versions; I haven't found the HDFS JIRA for it.
Context: the HBase Write-Ahead Log (WAL) feature, which uses HDFS append. If the node crashes, the file that was being written is read by other processes to replay the actions.
- So in HDFS we have one (dead) process that was writing and another process reading the same file.
- But despite the call to syncFs, we don't always see the data when a node is dead. This seems to be because DFSClient#updateBlockInfo ignores IPC errors and sets the block length to 0 (see the paraphrased sketch after this list).
- So we may miss all the writes to the last block if we happen to ask the dead DN for its length.
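For illustration, the problematic pattern looks roughly like the following. This is a paraphrased sketch, not the actual branch-1 source; the names primary, getBlockInfo, lastBlock, and setLastBlockSize are approximations of the real identifiers.

// Paraphrased fragment of the pattern in DFSClient#updateBlockInfo
// (branch-1); names are approximate, not the actual source.
long newBlockSize = 0;                 // fallback length
try {
  // If 'primary' is the dead datanode, this IPC call throws an IOException...
  newBlockSize = primary.getBlockInfo(lastBlock).getNumBytes();
} catch (IOException e) {
  // ...which is swallowed here: newBlockSize silently stays 0, so the
  // reader believes the last block is empty and misses every edit that
  // was synced into it.
}
locatedBlocks.setLastBlockSize(newBlockSize);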
HDFS 1.0.3, branch-1, or branch-1-win: we have the issue.
http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java?revision=1359853&view=markup
HDFS branch-2 or trunk: we should not have the issue (not tested, though).
http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java?view=markup
The attached test fails ~50% of the time.
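The attached test is not reproduced here, but a minimal sketch of this kind of reproduction could look like the following, assuming the Hadoop 1.x (branch-1) test APIs (MiniDFSCluster, FSDataOutputStream#sync); the file name and sizes are arbitrary.

// Minimal reproduction sketch against branch-1 test APIs; NOT the attached test.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class DeadDataNodeWALReadSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setBoolean("dfs.support.append", true); // branch-1 append support
    MiniDFSCluster cluster = new MiniDFSCluster(conf, 3, true, null);
    try {
      cluster.waitActive();
      FileSystem fs = cluster.getFileSystem();
      Path wal = new Path("/wal");

      // "Dead" writer: write edits and sync, but never close the file.
      FSDataOutputStream out = fs.create(wal);
      out.write(new byte[4096]);
      out.sync(); // hflush equivalent on branch-1; data reaches the DN pipeline

      // Kill one datanode of the pipeline holding the under-construction block.
      cluster.stopDataNode(0);

      // Replaying reader (simplified: a real test would use a separate client
      // so it fetches fresh block lengths from the datanodes). Whether it sees
      // the 4096 synced bytes depends on which DN the client happens to ask
      // for the last block's length: if it picks the dead one, the IPC error
      // is ignored, the length comes back as 0, and the read returns nothing,
      // which matches the intermittent failure described above.
      FSDataInputStream in = fs.open(wal);
      int n = in.read(new byte[4096]);
      System.out.println("read " + n + " bytes (expected 4096)");
      in.close();
    } finally {
      cluster.shutdown();
    }
  }
}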
Attachments
Issue Links
- is related to: HDFS-3701 HDFS may miss the final block when reading a file opened for writing if one of the datanode is dead (Closed)
- relates to: HBASE-6435 Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes (Closed)
- relates to: HBASE-5843 Improve HBase MTTR - Mean Time To Recover (Closed)