Description
I tried to launch an EC2 cluster using Patrick's new version of the EC2 script, running this command
./spark-ec2 -i ~/.ssh/id_rsa -s 10 -t m1.large --spot-price 0.25 --region us-west-1 launch test_cluster
This launched a cluster, but SSH failed because I hadn't configured things properly:
Setting up security groups...
Searching for existing cluster test_cluster...
Spark AMI: ami-61ffd024
Launching instances...
Requesting 10 slaves as spot instances with price $0.250
Waiting for spot instances to be granted...
0 of 10 slaves granted, waiting longer ...
All 10 slaves granted
Launched master in us-west-1a, regid = r-790f2320
Waiting for instances to start up...
Waiting 120 more seconds...
Copying SSH key /Users/joshrosen/.ssh/id_rsa to master...
Warning: Permanently added 'ec2-54-241-179-230.us-west-1.compute.amazonaws.com,54.241.179.230' (RSA) to the list of known hosts.
Permission denied (publickey).
When I tried to shut this cluster down, the EC2 script claimed that it couldn't find the cluster. I see the same message when using the 0.7 version of spark-ec2.
If I view the EC2 administration console, it tells me that I have a number of machines running in us-west-1 with the right security groups.
Although the script finds the active instance requests, it can't find any security group names for those requests, so it fails to identify those machines as belonging to our spark-ec2 cluster.
I think the culprit is the line
group_names = [g.name for g in res.groups]
in get_existing_cluster().
If I replace this with
group_names = list(set(g.name for i in res.instances for g in i.groups))
then it finds the group names and detects the cluster.
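To illustrate the difference, here's a minimal self-contained sketch using mock objects in place of Boto's Reservation/Instance/SecurityGroup (the names and structure are assumptions based on the attributes referenced above, since the real objects come from the EC2 API). It shows the reservation-level groups list coming back empty while the per-instance groups still carry the security group names:

```python
from collections import namedtuple

# Hypothetical stand-ins for Boto's objects; the real ones are returned
# by the EC2 API. Only the attributes used above are modeled.
Group = namedtuple("Group", ["name"])
Instance = namedtuple("Instance", ["groups"])
Reservation = namedtuple("Reservation", ["groups", "instances"])

# Simulate the observed state: the reservation's own groups list is empty,
# but each instance carries its security groups.
res = Reservation(
    groups=[],  # empty at the reservation level
    instances=[
        Instance(groups=[Group("test_cluster-master")]),
        Instance(groups=[Group("test_cluster-slaves")]),
    ],
)

# Original line: finds nothing, because res.groups is empty.
group_names = [g.name for g in res.groups]
print(group_names)  # []

# Fixed line: walk the instances and collect their groups instead.
# Note the for-clauses must be in nesting order: instances first, then groups.
group_names = sorted(set(g.name for i in res.instances for g in i.groups))
print(group_names)  # ['test_cluster-master', 'test_cluster-slaves']
```

One wrinkle worth flagging: the for-clauses in the set comprehension have to appear in nesting order (`for i in res.instances` before `for g in i.groups`), otherwise Python raises a NameError because `i` isn't bound yet.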
Does anyone know what's going on here? I'm not familiar enough with Boto to explain why the Instance objects would have a populated groups attribute while their Reservation's groups list is empty.