Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Won't Fix
-
None
-
None
-
None
Description
The current HDFS implementation has the limitation that it does not replicate the last partial block of a file when it is being written into until the file is closed. There are some long running applications (e.g. HBase) which writes transactions logs into HDFS. If datanode(s) in the write pipeline dies off, the application has no knowledge of it until all the datanode(s) fail and the application gets an IO error.
These applictions would benefit a lot if they can determine the number of live replicas of the current block to which it is writing data. For example, the application can decide that when one of the datanode in the write pipeline fails it will close the file and start writing to a new file.
Attachments
Attachments
Issue Links
- blocks
-
HDFS-826 Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline
- Closed