[SPARK-3332] Tagging is not atomic with launching instances on EC2 - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.1.1, 1.2.0
Component/s: EC2
Labels: None
Target Version/s: 1.1.1, 1.2.0

Description

The implementation for SPARK-2333 changed the machine membership mechanism from security groups to tags.

This is a fundamentally flawed strategy: launching an instance and tagging it are separate steps, so there is no guarantee that every machine will end up with a tag (even with a retry mechanism).

For instance, if the script is killed after launching the instances but before setting the tags, the machines will be "invisible" to a destroy command, leaving an unmanageable cluster behind.
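The failure mode can be reproduced with a small simulation. This is only a sketch: `FakeEC2`, `launch_cluster`, the `cluster` tag key, and the `demo-slaves` group name are hypothetical stand-ins invented for illustration, not the boto API or the actual spark_ec2 code. It assumes that a security group is attached as part of the launch call, while tags are applied in a later, separate step.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    instance_id: str
    security_groups: set
    tags: dict = field(default_factory=dict)


class ScriptKilled(Exception):
    """Stands in for the launch script dying mid-run."""


class FakeEC2:
    """In-memory stand-in for EC2 (hypothetical, not the boto API)."""

    def __init__(self):
        self.instances = []

    def run_instances(self, count, security_group):
        # Assumption: the security group is set by the launch call itself.
        start = len(self.instances)
        launched = [
            Instance(f"i-{start + n:04d}", {security_group}) for n in range(count)
        ]
        self.instances.extend(launched)
        return launched

    def find_by_tag(self, key, value):
        return [i for i in self.instances if i.tags.get(key) == value]

    def find_by_group(self, group):
        return [i for i in self.instances if group in i.security_groups]


def launch_cluster(ec2, name, count, die_before_tagging=False):
    launched = ec2.run_instances(count, security_group=f"{name}-slaves")
    if die_before_tagging:
        # The script is killed after launching but before tagging.
        raise ScriptKilled("killed between launch and tagging")
    for inst in launched:
        inst.tags["cluster"] = name  # separate step, not atomic with launch
    return launched


ec2 = FakeEC2()
try:
    launch_cluster(ec2, "demo", 3, die_before_tagging=True)
except ScriptKilled:
    pass

# What a destroy command would find under each membership mechanism.
visible_by_tag = ec2.find_by_tag("cluster", "demo")
visible_by_group = ec2.find_by_group("demo-slaves")
print(len(visible_by_tag), len(visible_by_group))
```

In this model the tag-based lookup finds none of the three running instances, while the security-group lookup still finds all of them, because group membership was fixed at launch time.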

The initial proposal is to go back to the previous behavior (membership by security group) in all cases except when the new flag (--security-group-prefix) is used.
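A minimal sketch of that selection logic follows. Only the `--security-group-prefix` flag comes from this issue; the parser, the `membership_mechanism` helper, and the `<name>-master` / `<name>-slaves` group naming are assumptions made for illustration and may not match the real script.

```python
import argparse


def build_parser():
    # Hypothetical parser; only --security-group-prefix is taken from the issue.
    parser = argparse.ArgumentParser(prog="spark-ec2-sketch")
    parser.add_argument("cluster_name")
    parser.add_argument("--security-group-prefix", default=None)
    return parser


def membership_mechanism(opts):
    """Return (mechanism, master_group, slave_group) for a cluster."""
    if opts.security_group_prefix is None:
        # Previous behavior: the per-cluster security groups define membership.
        return (
            "security-group",
            f"{opts.cluster_name}-master",
            f"{opts.cluster_name}-slaves",
        )
    # Groups may be shared across clusters, so membership falls back to tags.
    return (
        "tag",
        f"{opts.security_group_prefix}-master",
        f"{opts.security_group_prefix}-slaves",
    )


default_choice = membership_mechanism(build_parser().parse_args(["demo"]))
prefixed_choice = membership_mechanism(
    build_parser().parse_args(["demo", "--security-group-prefix", "shared"])
)
print(default_choice)
print(prefixed_choice)
```

With this split, the non-atomic tagging risk is confined to runs that explicitly opt in to the prefix flag.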

It is also worth mentioning that SPARK-3180 introduced the --additional-security-group flag, which is a reasonable solution to SPARK-2333 (but is not a full replacement for all use cases of --security-group-prefix).

Issue Links

is related to

SPARK-4509 Revert EC2 tag-based cluster membership patch in branch-1.2 (Closed)

relates to

SPARK-4983 Add sleep() before tagging EC2 instances to allow instance metadata to propagate (Resolved)

SPARK-2333 spark_ec2 script should allow option for existing security group (Resolved)

links to

[Github] Pull Request #2223 (douglaz)

[Github] Pull Request #2225 (JoshRosen)

People

Assignee: Josh Rosen

Reporter: Allan Douglas R. de Oliveira

Votes: 0

Watchers: 4

Dates

Created: 31/Aug/14 20:52

Updated: 10/Mar/15 03:15

Resolved: 26/Nov/14 00:08