Add sleep() before tagging EC2 instances to allow instance metadata to propagate

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.2, 1.3.0
    • Component/s: EC2

    Description

      We launch EC2 instances in spark-ec2 and then immediately tag them in a separate boto call. Sometimes EC2 has not yet propagated information about the just-launched instances, so the tagging request reaches a server that does not know about them yet.

      This yields the following type of error:

      Launching instances...
      Launched 1 slaves in us-east-1b, regid = r-cf780321
      Launched master in us-east-1b, regid = r-da7e0534
      Traceback (most recent call last):
        File "./ec2/spark_ec2.py", line 1284, in <module>
          main()
        File "./ec2/spark_ec2.py", line 1276, in main
          real_main()
        File "./ec2/spark_ec2.py", line 1122, in real_main
          (master_nodes, slave_nodes) = launch_cluster(conn, opts, cluster_name)
        File "./ec2/spark_ec2.py", line 646, in launch_cluster
          value='{cn}-master-{iid}'.format(cn=cluster_name, iid=master.id))
        File ".../spark/ec2/lib/boto-2.34.0/boto/ec2/ec2object.py", line 80, in add_tag
          self.add_tags({key: value}, dry_run)
        File ".../spark/ec2/lib/boto-2.34.0/boto/ec2/ec2object.py", line 97, in add_tags
          dry_run=dry_run
        File ".../spark/ec2/lib/boto-2.34.0/boto/ec2/connection.py", line 4202, in create_tags
          return self.get_status('CreateTags', params, verb='POST')
        File ".../spark/ec2/lib/boto-2.34.0/boto/connection.py", line 1223, in get_status
          raise self.ResponseError(response.status, response.reason, body)
      boto.exception.EC2ResponseError: EC2ResponseError: 400 Bad Request
      <?xml version="1.0" encoding="UTF-8"?>
      <Response><Errors><Error><Code>InvalidInstanceID.NotFound</Code><Message>The instance ID 'i-585219a6' does not exist</Message></Error></Errors><RequestID>b9f1ad6e-59b9-47fd-a693-527be1f779eb</RequestID></Response>
      

      The solution is to tag the instances in the same call that launches them or, less desirably, to tag the instances after a short wait.
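
The "short wait" option can be sketched as a small retry helper. This is an illustration only, not the patch applied to spark_ec2.py: the name add_tag_with_retry and its attempts, delay, and sleep parameters are hypothetical, and the defaults are arbitrary. It assumes the object exposes a boto-style add_tag(key, value) method and that the raised exception carries the EC2 error code in an error_code attribute, as boto's EC2ResponseError does.

```python
import time


def add_tag_with_retry(resource, key, value, attempts=5, delay=5.0, sleep=time.sleep):
    """Tag `resource`, waiting and retrying while EC2 reports it as unknown.

    `resource` is any object with a boto-style add_tag(key, value) method.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, attempts + 1):
        try:
            resource.add_tag(key, value)
            return attempt
        except Exception as exc:
            # Assumption: the exception exposes the EC2 error code as `error_code`.
            not_found = getattr(exc, "error_code", None) == "InvalidInstanceID.NotFound"
            # Re-raise unrelated errors, and give up once the attempts are used up.
            if not not_found or attempt == attempts:
                raise
            # Give EC2 time to propagate the new instance's metadata.
            sleep(delay)
```

In launch_cluster, the failing call from the traceback might then become something like add_tag_with_retry(master, 'Name', '{cn}-master-{iid}'.format(cn=cluster_name, iid=master.id)), assuming the tag key there is 'Name'. A single fixed sleep before tagging is simpler, but a bounded retry only waits when EC2 actually returns InvalidInstanceID.NotFound.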

            People

              Assignee: Gen TANG (gen)
              Reporter: Nicholas Chammas (nchammas)