Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Won't Fix
Affects Version/s: 0.20-append
Fix Version/s: None
Component/s: None
Description
- Summary: when appending to a partial block, it is possible that a
retry after an exception fails due to a checksum mismatch. The append
operation is not atomic (it should either complete fully or fail
completely, but does not).
- Setup:
+ # available datanodes = 2
+ # disks / datanode = 1
+ # failures = 1
+ failure type = bad disk
+ when/where failure happens = (see below)
- Details:
The client writes 16 bytes to dn1 and dn2, and the write completes. So far so good.
The meta file on each datanode now contains a 7-byte header plus a 4-byte
checksum (CK1, the checksum of the 16 bytes of data). The client then appends
16 more bytes, and let's assume an exception occurs in
BlockReceiver.receivePacket() at dn2, so the client knows dn2 is bad. BUT the
append at dn1 has completed (i.e., both the data and the checksum have been
written to the corresponding block file and meta file), so the meta file at dn1
now contains the 7-byte header plus a 4-byte checksum (CK2, the checksum of the
full 32 bytes of data). Because dn2 hit an exception, the client calls
recoverBlock and starts the append again against dn1. When dn1 receives the 16
bytes, it verifies that the stored checksum (CK2) matches what it recomputes
over the existing data (CK1); they obviously do not match. Hence an exception,
and the retry fails.
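The core of the mismatch can be sketched in a few lines. This is not HDFS code; it is a minimal illustration using java.util.zip.CRC32, assuming (as in the report) that the 32-byte block fits in a single checksum chunk, so the meta file holds one 4-byte checksum. CK2 covers 32 bytes while the recomputation after recoverBlock covers only the surviving 16 bytes, so the two values cannot agree:

```java
import java.util.zip.CRC32;

public class ChecksumMismatchDemo {
    public static void main(String[] args) {
        // The original 16-byte write, replicated to dn1 and dn2.
        byte[] first16 = "0123456789abcdef".getBytes();
        // The block contents on dn1 after the append succeeded there.
        byte[] all32 = "0123456789abcdef0123456789abcdef".getBytes();

        CRC32 crc = new CRC32();
        crc.update(first16);
        long ck1 = crc.getValue();   // CK1: checksum of the first 16 bytes

        crc.reset();
        crc.update(all32);
        long ck2 = crc.getValue();   // CK2: checksum of all 32 bytes, as stored in dn1's meta file

        // After recoverBlock, dn1 recomputes the checksum over the 16 bytes the
        // retried append believes exist (CK1) and compares it with the stored CK2.
        System.out.println("CK1 == CK2 ? " + (ck1 == ck2)); // prints "CK1 == CK2 ? false"
    }
}
```

Because dn1's meta file was already advanced to CK2 while the client's retry still assumes the pre-append state (CK1), the verification necessarily fails, which is why the retry cannot succeed without the append being atomic.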
- A similar bug has been reported at
https://issues.apache.org/jira/browse/HDFS-679,
but here it manifests in a different context.
This bug was found by our Failure Testing Service framework:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and
Haryadi Gunawi (haryadi@eecs.berkeley.edu)