Details
- Bug
- Status: Open
- Major
- Resolution: Unresolved
- 0.21.0
- None
- None
Description
While investigating failures on HDFS-1602, it became apparent that once a namenode storage volume is pulled out, the NN becomes completely "sticky" until FSImage:processIOError: removing storage removes the storage from the active set. During this time, no normal NN operations are possible (e.g. creating a directory on HDFS eventually times out).
In the case of NFS, this can be worked around with the soft, intr, timeo, and retrans mount options. However, better handling of the situation is apparently possible and needs to be implemented.
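One possible direction for the better handling mentioned above, sketched here only as an illustration: probe each storage directory on a separate thread with a bounded wait, so that a hung volume is dropped from the active set instead of blocking NN operations. The class StorageProbe, its methods, and the timeout value are hypothetical and are not part of the Hadoop code base; the sketch says nothing about how FSImage actually performs its checks.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not a Hadoop class: bounds the time spent checking a storage directory.
final class StorageProbe {
    // Daemon threads, so a probe stuck on a hung mount cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "storage-probe");
        t.setDaemon(true);
        return t;
    });

    // Returns true only if the directory passes a basic check within timeoutMs.
    static boolean isResponsive(File dir, long timeoutMs) {
        Future<Boolean> check = POOL.submit(() -> dir.isDirectory() && dir.canWrite());
        try {
            return check.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Best effort only: a thread blocked on a hard NFS mount may never wake up.
            check.cancel(true);
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        }
    }

    // Keeps only the directories that responded in time (the "active set" in this sketch).
    static List<File> activeSet(List<File> dirs, long timeoutMs) {
        List<File> active = new ArrayList<>();
        for (File dir : dirs) {
            if (isResponsive(dir, timeoutMs)) {
                active.add(dir);
            }
        }
        return active;
    }
}
```

The caller waits at most timeoutMs per directory, at the cost of possibly leaking a blocked probe thread for as long as the volume stays hung.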
Issue Links
- is related to: HDFS-1602 NameNode storage failed replica restoration is broken (Closed)