Details
Bug
Status: Resolved
Major
Resolution: Fixed
2.4.2
None
Description
We came across a situation on one of our management nodes related to this:
https://bugzilla.redhat.com/show_bug.cgi?id=962755
The management node had an old NFS share mounted from a storage unit which was removed from service. Attempts to unmount the share were not successful.
Under fairly rare circumstances, a vcld process will call lsof on the management node to determine which other vcld process is preventing it from obtaining a semaphore. In this case, the vcld process that called lsof hung indefinitely because of the unavailable NFS share and the issue described in the link above.
The code that executes commands locally on the management node currently has no timeout mechanism. It would be beneficial to add one and to specify a timeout for commands that may hang, such as lsof.
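As a rough illustration of the kind of timeout being requested, here is a minimal Python sketch. It is not the vcld code and does not reflect how the fix was actually implemented; the function name run_local_command and its return convention are made up for this example. It assumes that a child process blocked on an unreachable NFS share may not exit promptly even after being killed, so the caller gives up after a short grace period instead of waiting on the child indefinitely.

```python
import subprocess


def run_local_command(command, timeout_seconds):
    """Run a command locally with a time limit.

    Hypothetical helper for illustration only.
    Returns (exit_status, output) on completion, or (None, None) on timeout.
    """
    proc = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,
    )
    try:
        output, _ = proc.communicate(timeout=timeout_seconds)
        return proc.returncode, output
    except subprocess.TimeoutExpired:
        # Ask the kernel to kill the child, then allow a short grace period.
        proc.kill()
        try:
            proc.communicate(timeout=1)
        except subprocess.TimeoutExpired:
            # Assumption: the child may be stuck in uninterruptible I/O
            # (for example on a dead NFS mount). Do not block on it.
            pass
        return None, None
```

A caller could then run something like run_local_command(["lsof", semaphore_path], 30), where semaphore_path and the 30 second limit are placeholders, and treat a (None, None) result as "could not determine the owning process" rather than hanging.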