Thanks Eli for explaining the use case. I briefly talked to Koji about this Jira.
Some more thoughts on this.
1. If dfs.data.dir.critical is not defined, the implementation should fall back to the existing tolerate-a-volume-failure behavior.
2. If dfs.data.dir.critical is defined, then fail fast and fail-stop as you described.
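The two cases above could be sketched roughly as below. This is only an illustration of the decision, not the actual datanode code; the class and method names are made up, and `dfs.data.dir.critical` is the key proposed in this Jira, not an existing Hadoop config.

```java
import java.util.HashSet;
import java.util.Set;

public class CriticalVolumePolicy {

    // Parse a comma-separated directory list, as dfs.data.dir is parsed.
    static Set<String> parseDirs(String value) {
        Set<String> dirs = new HashSet<>();
        if (value != null && !value.trim().isEmpty()) {
            for (String d : value.split(",")) {
                dirs.add(d.trim());
            }
        }
        return dirs;
    }

    /**
     * Decide whether a failed volume should stop the datanode.
     *
     * @param criticalConf  value of the proposed dfs.data.dir.critical key
     * @param failedVolume  the volume that just failed
     * @param failedSoFar   total failed volumes, including this one
     * @param tolerated     existing tolerated-failures threshold
     */
    static boolean shouldFailStop(String criticalConf, String failedVolume,
                                  int failedSoFar, int tolerated) {
        Set<String> critical = parseDirs(criticalConf);
        if (critical.isEmpty()) {
            // Case 1: no critical dirs configured -- fall back to the
            // existing "tolerate N volume failures" behavior.
            return failedSoFar > tolerated;
        }
        // Case 2: critical dirs configured -- fail fast if any of them fails.
        return critical.contains(failedVolume);
    }
}
```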
Case 2 you mentioned is interesting too. Today, the datanode is not aware of this case, since the directory may not be part of the dfs.data.dir config.
I see the key benefit of this Jira as fail-fast: if any of the critical volume(s) fail, we let the namenode know immediately and the datanode exits. Replication will then be taken care of, and cluster/datanode restarts should see fewer missing-block issues.
W.r.t. case 2 you mentioned, these are the possible failures, right?
1. Data is stored on the root partition disk, say /root/hadoop (binaries, conf, logs) and /root/data0.
Failures: /root read-only filesystem or failure, /root/data0 read-only filesystem or failure, complete disk0 failure.
2. Data NOT stored on the root partition disk: /root (disk1), /data0 (disk2).
Failures: /root read-only filesystem or failure, /data0 (disk2) read-only filesystem or failure.
3. Swap partition failure
How will this be detected?
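For the read-only and failed-disk cases above, the usual detection is an explicit read/write probe on each directory, similar in spirit to Hadoop's DiskChecker. A minimal sketch (the class name and probe-file naming are my own, and this does not cover the swap-partition case, which wouldn't show up as a directory write failure):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class VolumeProbe {

    /**
     * Probe a storage directory by writing and syncing a small file.
     * A read-only remount or a dead disk shows up as a failed write.
     */
    static boolean isWritable(File dir) {
        if (!dir.isDirectory() || !dir.canRead()) {
            return false;
        }
        // Unique probe file so concurrent checks don't collide.
        File probe = new File(dir, ".probe-" + System.nanoTime());
        try (FileOutputStream out = new FileOutputStream(probe)) {
            out.write(0);
            out.getFD().sync();   // force the write through to the disk
            return true;
        } catch (IOException e) {
            return false;         // EROFS / EIO: read-only or failed volume
        } finally {
            probe.delete();
        }
    }
}
```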
I am wondering whether the datanode itself should worry about all of these health issues, or whether a configuration like the TaskTracker's health-check script, which would inform the datanode about disk issues, network issues, etc., is the better option?