Details
Improvement
Status: Resolved
Minor
Resolution: Fixed
2.3.2
None
Description
xCAT commands run from the management node may occasionally display "Error: Timeout". This usually occurs when multiple nodes are being loaded and several xCAT commands are issued concurrently.
Running the command again usually succeeds. The current xCAT code already contains loops that detect errors and retry, but these are sometimes not sufficient. It would be beneficial to differentiate timeout errors from other errors. The current loops are sufficient for non-timeout errors: if such an error occurs several times in a row, it usually indicates a problem that repeated attempts will not fix. If a timeout error occurs, however, more leeway should be given and additional attempts should be made.
This will help prevent some new and reload reservations from failing.
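The retry policy requested here could be sketched roughly as follows. This is only an illustration in Python, not the actual fix: the function names, the two separate attempt budgets, the default limits, and the delay are all assumptions made for the example. The only detail taken from this issue is the "Error: Timeout" text, which is used to classify a failure as a timeout.

```python
import time
from typing import Callable, Tuple

# Text quoted in the description; everything else here is hypothetical.
TIMEOUT_MARKER = "Error: Timeout"


def is_timeout(output: str) -> bool:
    """Classify a command's output as a timeout failure."""
    return TIMEOUT_MARKER in output


def run_with_retries(
    run_command: Callable[[], Tuple[int, str]],
    max_other_failures: int = 3,   # assumed limit for non-timeout errors
    max_timeouts: int = 10,        # assumed, more generous limit for timeouts
    delay_seconds: float = 5.0,    # assumed pause between attempts
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, str]:
    """Run a command, giving timeout errors more attempts than other errors.

    run_command is any callable returning (exit_status, output).
    Returns (succeeded, last_output).
    """
    timeouts = 0
    other_failures = 0
    while True:
        exit_status, output = run_command()
        timed_out = is_timeout(output)
        if exit_status == 0 and not timed_out:
            return True, output

        # Count the failure against the budget that matches its kind.
        if timed_out:
            timeouts += 1
        else:
            other_failures += 1

        # Give up once either budget is exhausted.
        if timeouts >= max_timeouts or other_failures >= max_other_failures:
            return False, output

        sleep(delay_seconds)
```

With these assumed defaults, a command that times out four times and then succeeds is still treated as a success, while a command that keeps failing with some other error is abandoned after three attempts.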