Uploaded image for project: 'VCL'
  1. VCL
  2. VCL-1072

VMware may incorrectly indicate power on failed



    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 2.5
    • vcld (backend)
    • None


      I have seen on occasion VM new or reload requests fail because the code detects that VMware failed to power the VM on. The first warning that appears in vcld.log looks like this:

      VIM_SSH.pm:_run_vim_cmd|293| ---- WARNING ----
      VIM_SSH.pm:_run_vim_cmd|293| 2017-08-05 12:02:29VIM_SSH.pm:_run_vim_cmd|293|attempt 1/5: command failed on VM host <VMware host>: vim-cmd vmsvc/power.on 369, exit status: 1, output:
      VIM_SSH.pm:_run_vim_cmd|293| Powering on VM:
      VIM_SSH.pm:_run_vim_cmd|293| Power on failed
      VIM_SSH.pm:_run_vim_cmd|293| ( 0) VIM_SSH.pm, _run_vim_cmd (line: 293)
      VIM_SSH.pm:_run_vim_cmd|293| (-1) VIM_SSH.pm, vm_power_on (line: 1224)
      VIM_SSH.pm:_run_vim_cmd|293| (-2) VMware.pm, power_on (line: 6861)
      VIM_SSH.pm:_run_vim_cmd|293| (-3) VMware.pm, load (line: 751)
      VIM_SSH.pm:_run_vim_cmd|293| (-4) new.pm, reload_image (line: 588)
      VIM_SSH.pm:_run_vim_cmd|293| (-5) new.pm, process (line: 275)
      VIM_SSH.pm:_run_vim_cmd|293| (-6) vcld, make_new_child (line: 601)
      VIM_SSH.pm:_run_vim_cmd|293| (-7) vcld, main (line: 353)

      The Power on failed message may be true in some cases. However, I've seen this returned when the VM, in fact, did power on when the first attempt is made.

      The output of the vim-cmd vmsvc/power.on command does not indicate that the VM is already powered on.

      blade1a2-7:/vmfs/volumes/5900b99c-beb1caa4-1e81-3440b5dc1a7c/vs1-21_4513-v3 $ vim-cmd vmsvc/power.on 369
      Powering on VM:
      Power on failed

      blade1a2-7:/vmfs/volumes/5900b99c-beb1caa4-1e81-3440b5dc1a7c/vs1-21_4513-v3 $ vim-cmd vmsvc/power.getstate 369
      Retrieved runtime info
      Powered on

      Regardless, the VIM_SSH.pm::_run_vim_cmd subroutine then makes 4 more attempts. All subsequent attempts fail because the VM is already powered on and the reload eventually fails:

      2017-08-05 12:02:29VIM_SSH.pm:_run_vim_cmd|203|sleeping 5 seconds before making attempt 2/5
      ... (4 additional power on attempts are made)
      VIM_SSH.pm:_run_vim_cmd|309| ---- WARNING ----
      VIM_SSH.pm:_run_vim_cmd|309| 2017-08-05 12:02:52VIM_SSH.pm:_run_vim_cmd|309|failed to run VIM command on VM host <VMware host>: 'vim-cmd vmsvc/power.on 369', made 5 attempts
      VMware.pm:load|752| ---- WARNING ----
      VMware.pm:load|752| 2017-08-05 12:02:52VMware.pm:load|752|failed to power on VM <VM> on VM host: <VMware host>
      new.pm:reload_image|596| ---- WARNING ----
      new.pm:reload_image|596| 2017-08-05 12:02:52new.pm:reload_image|596|<image> failed to load on <VM>, returning
      new.pm:process|325| ---- WARNING ----
      new.pm:process|325| 2017-08-05 12:02:52new.pm:process|325|failed to load <VM> with <image>
      State.pm:reservation_failed|461| ---- CRITICAL ----
      State.pm:reservation_failed|461| 2017-08-05 12:02:52State.pm:reservation_failed|461|reservation failed on <VM>: process failed after trying to load or make available

      At a minimum, the VMware.pm::power_on subroutine should check the power state of the VM after its call to the API module's vm_power_on subroutine fails. This would prevent the reservation to fail.

      A better solution would be for a single attempt to be made followed by a power status check if it indicates the power on operation failed. If the VM is indeed off, make additional power on attempts.




            arkurth Andrew Kurth
            arkurth Andrew Kurth
            0 Vote for this issue
            1 Start watching this issue