[HDFS-16598] Fix DataNode FsDatasetImpl lock issue without GS checks. - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.4.0
Fix Version/s: 3.4.0
Component/s: datanode
Labels:
- pull-request-available

Target Version/s: 3.4.0
Hadoop Flags: Reviewed
External issue ID: ~~HDFS-16534~~

Description

The test org.apache.hadoop.hdfs.testPipelineRecoveryOnRestartFailure failed with a stack trace like the following:

java.io.IOException: All datanodes [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]] are bad. Aborting...
	at org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1667)
	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1601)
	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1587)
	at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1371)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:674)

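Tracing the root cause shows that this bug was introduced by HDFS-16534. When pipeline recovery fails, the block generation stamp (GS) held by the client may be smaller than the GS recorded on the DataNode (DN), so a DataNode code path that requires the two to match can fail for that block.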
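To illustrate the mismatch described above, here is a minimal, self-contained sketch. It is not the code from the patch: `GsLookupSketch`, `Replica`, `storageUuidStrict`, and `storageUuidByBlockId` are hypothetical names, and the exact-match GS rule is an assumption made for illustration. It shows how a lookup that also compares generation stamps misses a replica when the client's GS lags behind the DataNode's, while a lookup keyed on the block ID alone still resolves it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative sketch only; all names here are hypothetical, not HDFS APIs. */
public class GsLookupSketch {

    /** Hypothetical stand-in for a replica record kept by a DataNode. */
    public static final class Replica {
        final long blockId;
        final long generationStamp;
        final String storageUuid;

        public Replica(long blockId, long generationStamp, String storageUuid) {
            this.blockId = blockId;
            this.generationStamp = generationStamp;
            this.storageUuid = storageUuid;
        }
    }

    // Replicas indexed by block ID (a single block pool is assumed).
    private final Map<Long, Replica> replicas = new HashMap<>();

    public void add(Replica r) {
        replicas.put(r.blockId, r);
    }

    /** Strict lookup: succeeds only if the caller's GS equals the stored GS (assumed rule). */
    public Optional<String> storageUuidStrict(long blockId, long callerGs) {
        return Optional.ofNullable(replicas.get(blockId))
                .filter(r -> r.generationStamp == callerGs)
                .map(r -> r.storageUuid);
    }

    /** Relaxed lookup: resolves the storage by block ID only, ignoring the GS. */
    public Optional<String> storageUuidByBlockId(long blockId) {
        return Optional.ofNullable(replicas.get(blockId)).map(r -> r.storageUuid);
    }

    public static void main(String[] args) {
        GsLookupSketch dn = new GsLookupSketch();
        // The DataNode holds the replica with a newer GS than the client knows about.
        dn.add(new Replica(1L, 1002L, "DS-example"));

        long staleClientGs = 1001L;
        System.out.println("strict found:  " + dn.storageUuidStrict(1L, staleClientGs).isPresent());
        System.out.println("relaxed found: " + dn.storageUuidByBlockId(1L).isPresent());
    }
}
```

In this sketch, a step that only needs to find which storage holds the block (for example, to pick a lock) uses the relaxed lookup. GS validation, where it is still required, can then be done separately once the lock is held.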

Issue Links

links to

GitHub Pull Request #4366

People

Assignee: ZanderXu

Reporter: ZanderXu

Votes: 0

Watchers: 2

Dates

Created: 28/May/22 01:23

Updated: 12/Feb/24 06:42

Resolved: 14/Jun/22 13:50
