[WHIRR-414] whirr can have a non-zero return code and unterminated (orphaned) host instances - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 0.6.0
Fix Version/s: 0.7.0
Component/s: core
Labels: None
Environment: EC2, command-line Whirr

Description

Whirr can fail to start a cluster completely, and it signals this with a non-zero return code. In many partial-failure scenarios (currently intermittent), some resources remain active and are not cleaned up; in my experience these are EC2 machine instances.

Whenever I have seen orphaned nodes, the log contained "IOException: Too many instance failed while bootstrapping!".

A non-zero return code should guarantee that all resources have been cleaned up. Without this post-condition, each failure requires manual inspection and cleanup to stop paying for idle instances. That is why I marked this bug critical: it must be addressed before Whirr can be run safely from a cron job.
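
Until the post-condition holds inside Whirr itself, a cron job can approximate it from the outside. The following is a minimal sketch, not the fix applied in the attached patches: it assumes the `whirr` command is on the PATH and that `launch-cluster` and `destroy-cluster` are both given the same properties file via `--config`. The function name `launch_with_cleanup` and the injectable `run` parameter are hypothetical, introduced only for illustration.

```python
import subprocess
import sys


def launch_with_cleanup(config_path, run=subprocess.call):
    """Launch a cluster; on failure, attempt to destroy whatever was started.

    `run` takes an argument list and returns an exit code. It defaults to
    subprocess.call and is injectable so the logic can be exercised
    without touching a cloud provider.
    """
    rc = run(["whirr", "launch-cluster", "--config", config_path])
    if rc != 0:
        # Launch failed: some instances may still be running, so try to
        # tear the cluster down rather than leave it orphaned.
        destroy_rc = run(["whirr", "destroy-cluster", "--config", config_path])
        if destroy_rc != 0:
            # Cleanup is best effort; flag it for manual inspection.
            sys.stderr.write("launch and cleanup both failed; check for orphaned instances\n")
    # Always report the original launch result to the caller.
    return rc
```

A cron wrapper would call `launch_with_cleanup("cluster.properties")` and exit with the returned code. Whether `destroy-cluster` can actually find the instances left behind by a failed launch depends on what state Whirr recorded before failing, so this wrapper narrows the problem rather than removing it.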

Attachments

WHIRR-414.patch, 28/Oct/11 09:30, 2 kB, Andrei Savu
WHIRR-414.patch, 08/Nov/11 21:39, 7 kB, David Alves
WHIRR-414.patch, 08/Nov/11 21:57, 7 kB, David Alves
WHIRR-414.patch, 09/Nov/11 01:42, 8 kB, Andrei Savu
WHIRR-414.patch, 10/Nov/11 11:08, 6 kB, Andrei Savu
WHIRR-414-ignore-missing-instances-file.patch, 10/Nov/11 14:01, 3 kB, Andrei Savu
WHIRR-414-ignore-missing-instances-file.patch, 11/Nov/11 22:17, 3 kB, Andrei Savu

People

Assignee: Andrei Savu

Reporter: Paul Baclace

Votes: 0

Watchers: 0

Dates

Created: 27/Oct/11 23:58

Updated: 13/Nov/11 07:37

Resolved: 13/Nov/11 07:37