Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
2.0.0
None
Description
Alerts are "suppressed" in maintenance mode by carrying a maintenance_state attribute in addition to the actual state being reported:
"Alert": {
  "cluster_name": "c1",
  "component_name": "METRICS_COLLECTOR",
  "definition_id": 43,
  "definition_name": "ams_metrics_collector_process",
  "host_name": "c6401.ambari.apache.org",
  "id": 28,
  "instance": null,
  "label": "Metrics Collector Process",
  "latest_timestamp": 1457108946118,
  "maintenance_state": "ON",
  "original_timestamp": 1457108646099,
  "scope": "ANY",
  "service_name": "AMBARI_METRICS",
  "state": "CRITICAL",
  "text": "Connection failed: [Errno 111] Connection refused to c6401.ambari.apache.org"
}
When a host, service, or component is placed into maintenance mode (MM), the database is updated so that every affected alert_current row has its maintenance_state updated as well.
However, this fails under two scenarios:
- The alert hasn't been received yet in a brand new cluster
- The alert definition was disabled, which removed all current alerts. Then, it was re-enabled.
In both cases, when constructing a new AlertCurrentEntity, we need to calculate the correct maintenance state.
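A minimal sketch of the idea, not the actual patch: when a new current-alert row is created, its maintenance state is derived from the host, service, and component it belongs to instead of defaulting to OFF. Every name here (MaintenanceStateSketch, NewAlertCurrent, effectiveState, createCurrent) is hypothetical and only illustrates the calculation. The rule that any level being ON puts the alert in MM is also an assumption.

```java
// Hypothetical sketch; none of these names are taken from the Ambari code base.
final class MaintenanceStateSketch {

    enum MaintenanceState { OFF, ON }

    /** Minimal stand-in for a newly constructed alert_current row. */
    static final class NewAlertCurrent {
        final String definitionName;
        final MaintenanceState maintenanceState;

        NewAlertCurrent(String definitionName, MaintenanceState maintenanceState) {
            this.definitionName = definitionName;
            this.maintenanceState = maintenanceState;
        }
    }

    /**
     * Assumed rule: the alert is in MM if any level it belongs to
     * (host, service, component) is in MM. Null entries count as OFF.
     */
    static MaintenanceState effectiveState(MaintenanceState... levels) {
        for (MaintenanceState level : levels) {
            if (level == MaintenanceState.ON) {
                return MaintenanceState.ON;
            }
        }
        return MaintenanceState.OFF;
    }

    /** Computes the state at construction time rather than relying on a later bulk update. */
    static NewAlertCurrent createCurrent(String definitionName,
                                         MaintenanceState host,
                                         MaintenanceState service,
                                         MaintenanceState component) {
        return new NewAlertCurrent(definitionName, effectiveState(host, service, component));
    }

    public static void main(String[] args) {
        // The service is already in MM when the first alert for the definition arrives.
        NewAlertCurrent current = createCurrent(
                "ams_metrics_collector_process",
                MaintenanceState.OFF, MaintenanceState.ON, MaintenanceState.OFF);
        System.out.println(current.definitionName + " maintenance_state=" + current.maintenanceState);
    }
}
```

Because the state is computed when the row is built, both failing scenarios (first alert in a new cluster, and alerts recreated after a definition is re-enabled) pick up an MM setting that was applied before the row existed.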