Description
We launch EC2 instances in spark-ec2 and then immediately tag them in a separate boto call. Sometimes EC2 has not yet propagated information about the just-launched instances, so the tagging request reaches a server that does not know about them yet.
This yields the following type of error:
Launching instances...
Launched 1 slaves in us-east-1b, regid = r-cf780321
Launched master in us-east-1b, regid = r-da7e0534
Traceback (most recent call last):
  File "./ec2/spark_ec2.py", line 1284, in <module>
    main()
  File "./ec2/spark_ec2.py", line 1276, in main
    real_main()
  File "./ec2/spark_ec2.py", line 1122, in real_main
    (master_nodes, slave_nodes) = launch_cluster(conn, opts, cluster_name)
  File "./ec2/spark_ec2.py", line 646, in launch_cluster
    value='{cn}-master-{iid}'.format(cn=cluster_name, iid=master.id))
  File ".../spark/ec2/lib/boto-2.34.0/boto/ec2/ec2object.py", line 80, in add_tag
    self.add_tags({key: value}, dry_run)
  File ".../spark/ec2/lib/boto-2.34.0/boto/ec2/ec2object.py", line 97, in add_tags
    dry_run=dry_run
  File ".../spark/ec2/lib/boto-2.34.0/boto/ec2/connection.py", line 4202, in create_tags
    return self.get_status('CreateTags', params, verb='POST')
  File ".../spark/ec2/lib/boto-2.34.0/boto/connection.py", line 1223, in get_status
    raise self.ResponseError(response.status, response.reason, body)
boto.exception.EC2ResponseError: EC2ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidInstanceID.NotFound</Code><Message>The instance ID 'i-585219a6' does not exist</Message></Error></Errors><RequestID>b9f1ad6e-59b9-47fd-a693-527be1f779eb</RequestID></Response>
The solution is to tag the instances in the same call that launches them or, less desirably, to tag them after a short wait.
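As a rough illustration of the less desirable "wait and retry" option, here is a minimal sketch. The helper name tag_with_retry and its parameters are hypothetical and not part of spark_ec2.py. It assumes the raised exception carries an error_code attribute, as boto's EC2ResponseError does, and retries only on InvalidInstanceID.NotFound.

```python
import time


def tag_with_retry(add_tag, key, value, attempts=5, delay=1.0, sleep=time.sleep):
    """Call add_tag(key, value), retrying while EC2 reports the instance as unknown.

    add_tag is any callable shaped like boto's Instance.add_tag(key, value).
    This helper is a hypothetical sketch, not code from spark_ec2.py.
    """
    for attempt in range(1, attempts + 1):
        try:
            return add_tag(key, value)
        except Exception as e:
            # Assumption: the exception exposes the EC2 error code as
            # `error_code`, as boto's EC2ResponseError does.
            not_found = getattr(e, "error_code", None) == "InvalidInstanceID.NotFound"
            if not not_found or attempt == attempts:
                raise  # unrelated error, or out of attempts
            sleep(delay * attempt)  # linear backoff before the next try
```

In launch_cluster, a call such as master.add_tag(key=..., value=...) could then be routed through tag_with_retry(master.add_tag, ...). Tagging in the launch call itself remains the preferable fix, because it removes the race instead of waiting it out.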
Issue Links
- is duplicated by
  - SPARK-7900 Reduce number of tagging calls in spark-ec2 (Resolved)
- is related to
  - SPARK-3332 Tagging is not atomic with launching instances on EC2 (Closed)