  1. Libcloud
  2. LIBCLOUD-157

Deployment script retries are brain-dead

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.8.0
    • Fix Version/s: None
    • Component/s: Core
    • Labels: None

    Description

      In common/base, NodeDriver._run_deployment_script has the following retry wrapper:

      tries = 0
      while tries < max_tries:
          try:
              node = task.run(node, ssh_client)
          except Exception:
              tries += 1
              if tries >= max_tries:
                  raise LibcloudError(value='Failed after %d tries'
                                      % (max_tries), driver=self)
          else:
              ssh_client.close()
              return node

      The except Exception swallows all errors, making debugging very hard.

      Furthermore, max_tries is effectively hard-coded in deploy_node():

      self._run_deployment_script(task=kwargs['deploy'],
                                  node=node,
                                  ssh_client=ssh_client,
                                  max_tries=3)

      ... forcing people who want to control retries to spin their own deploy_node().

      Suggestions:

      • at a minimum, log or warn about the error that's caught in the retry loop
      • better yet, make the catch more fine-grained, so that errors that we know won't be retry-able will fail out immediately.
      • think about making the default number of max_tries 1
      • make max_tries controllable from deploy_node
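
      A minimal standalone sketch of what the suggestions above could look like, not a patch against Libcloud itself. The names DeploymentError, NON_RETRYABLE, run_deployment_script and the simplified deploy_node signature are hypothetical, and which exception types count as non-retryable is an assumption chosen only for illustration:

```python
import logging

logger = logging.getLogger(__name__)


class DeploymentError(Exception):
    """Hypothetical stand-in for LibcloudError so the sketch runs on its own."""


# Assumption: which errors are "known not retry-able" is a judgment call.
# Programming errors are used here purely as an illustration.
NON_RETRYABLE = (TypeError, AttributeError, NameError)


def run_deployment_script(task, node, ssh_client, max_tries=3):
    """Run task.run(node, ssh_client), retrying up to max_tries times."""
    tries = 0
    while tries < max_tries:
        try:
            node = task.run(node, ssh_client)
        except NON_RETRYABLE:
            # Retrying cannot fix these, so let the original traceback surface.
            raise
        except Exception as exc:
            tries += 1
            # Log every caught error so a failed attempt is never silent.
            logger.warning('Deployment attempt %d of %d failed: %r',
                           tries, max_tries, exc)
            if tries >= max_tries:
                # Include the last error in the message to aid debugging.
                raise DeploymentError(
                    'Failed after %d tries: %r' % (max_tries, exc))
        else:
            ssh_client.close()
            return node


def deploy_node(task, node, ssh_client, **kwargs):
    """Hypothetical, simplified deploy_node that forwards max_tries."""
    # Read max_tries from kwargs instead of hard-coding it; the default of 3
    # matches the current behaviour and could be lowered to 1.
    return run_deployment_script(task=task, node=node, ssh_client=ssh_client,
                                 max_tries=kwargs.get('max_tries', 3))
```

      With something along these lines, a caller could write deploy_node(..., max_tries=1) to disable retries, and each failed attempt would show up in the log at warning level.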

          People

            Assignee: kami Tomaz Muraus
            Reporter: mnot Mark Nottingham
