[AMBARI-15141] Start all services request aborts in the middle and hosts go into heartbeat-lost state - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 2.2.0
Fix Version/s: 2.2.2
Component/s: ambari-server
Labels: None

Description

On the 1600-node cluster I attempted Stop-All and Start-All actions. During both actions, the request would abort itself, sometimes even when there were no failures or timeouts. During this time, hosts would also go into the heartbeat-lost state, and the cluster would look like it was broken. After about 5-10 minutes, the hosts would slowly come back online and the correct host status would be reported. However, because of the request abort, some components would be stopped or started when they should not be.

Attachments

AMBARI-15141_branch-2.2.patch, 215 kB, 23/Feb/16 13:44, Papirkovskyy Myroslav
AMBARI-15141.patch, 216 kB, 23/Feb/16 13:44, Papirkovskyy Myroslav

Issue Links

links to

ReviewBoard

People

Assignee: Papirkovskyy Myroslav

Reporter: Papirkovskyy Myroslav

Votes: 0

Watchers: 2

Dates

Created: 23/Feb/16 13:30

Updated: 24/Feb/16 01:44

Resolved: 23/Feb/16 19:16