Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Nagios reads the NameNode (NN) JMX metrics. When CorruptBlocks > 0, it sends an alert to the administrator. However, a corrupt replica is not a concern; only a corrupt block should trigger an alert. Corrupt replicas occur often in some environments, and this alert misleads some customers into thinking that data has been lost.
The JMX metric "NumberOfMissingBlocks" is the one that reports corrupt blocks.
We want to make the following changes to Nagios alerting:
1) Check NumberOfMissingBlocks and send an alert when it is not zero. (Is the current Ambari code already doing this?)
2) Do not send an alert when the "CorruptBlocks" value changes.
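The two changes above can be sketched as a small check function. This is only an illustration, not the Ambari implementation: it assumes the NameNode JMX output has already been fetched and parsed into a dict with a top-level "beans" list, and the names find_metric and check_missing_blocks are hypothetical.

```python
# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def find_metric(jmx, metric):
    """Return the first value of `metric` found in any JMX bean, else None."""
    for bean in jmx.get("beans", []):
        if metric in bean:
            return bean[metric]
    return None


def check_missing_blocks(jmx):
    """Alert only on NumberOfMissingBlocks; CorruptBlocks is deliberately ignored."""
    missing = find_metric(jmx, "NumberOfMissingBlocks")
    if missing is None:
        return UNKNOWN, "UNKNOWN: NumberOfMissingBlocks not found in JMX output"
    if missing != 0:
        return CRITICAL, "CRITICAL: %d missing block(s)" % missing
    return OK, "OK: no missing blocks"


if __name__ == "__main__":
    # Hypothetical sample: corrupt replicas exist, but no block is missing.
    sample = {"beans": [{"CorruptBlocks": 5}, {"NumberOfMissingBlocks": 0}]}
    code, message = check_missing_blocks(sample)
    print(code, message)
```

Because CorruptBlocks is never read, a change in its value cannot raise an alert, which matches item 2.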