Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Incomplete
Description
When checkFileSystem() fails, the region server should wait for some time before aborting. By default, this timeout could be the same as the ZooKeeper session timeout.
For example, when a rack switch reboots or fails for a few minutes and all traffic to the region server stops, we don't want region servers to kill themselves unnecessarily just because ongoing compactions or flushes fail.
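A minimal sketch of the proposed behavior, to make the request concrete. It is not existing HBase code: the class FsFailureGracePeriod and its method shouldAbort are hypothetical names, and the caller is assumed to pass in the result of each file system check together with the current time. The grace period would presumably be set to the ZooKeeper session timeout by default.

```java
// Hypothetical helper, not part of HBase: tracks how long file system checks
// have been failing and signals an abort only after the grace period elapses.
final class FsFailureGracePeriod {
    // Assumed default in a real server: the ZooKeeper session timeout.
    private final long graceMillis;
    // Time of the first failure in the current streak; -1 means no streak.
    private long firstFailureMillis = -1L;

    FsFailureGracePeriod(long graceMillis) {
        this.graceMillis = graceMillis;
    }

    /**
     * Records the outcome of one file system check.
     *
     * @param fsAvailable whether the check succeeded
     * @param nowMillis   current time in milliseconds, supplied by the caller
     * @return true if failures have persisted for the whole grace period
     */
    boolean shouldAbort(boolean fsAvailable, long nowMillis) {
        if (fsAvailable) {
            // The file system recovered, so forget the failure streak.
            firstFailureMillis = -1L;
            return false;
        }
        if (firstFailureMillis < 0) {
            // First failure of a new streak: start the clock.
            firstFailureMillis = nowMillis;
        }
        return nowMillis - firstFailureMillis >= graceMillis;
    }
}
```

With a grace period of 0 this degenerates to aborting on the first failed check. A short outage such as a rack switch reboot resets the clock as soon as one check succeeds again, so failed compactions or flushes during the outage would not by themselves bring the server down.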