Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Description: If the Secondary NameNode (SNN) fails to merge the edit files for
any reason, Nagios doesn't alert on it.
PROBLEM: If the SNN fails to merge the edit files for a long time, for whatever
reason, the failure goes undetected. The edit files can then grow very large
and slow down NameNode performance. In some cases this can lead to corruption
of the NameNode edit files.
BUSINESS IMPACT: If Nagios doesn't alert on SNN functionality, this will
eventually cause long downtime for all customers and a possibility of data
loss.
STEPS TO REPRODUCE:
- The SNN fails to merge the edit files for any reason.
- The NameNode edit files grow in size.
- The edit files become corrupted.
ACTUAL BEHAVIOR: Nagios doesn't fire a critical alarm.
EXPECTED BEHAVIOR: Nagios should fire a critical alarm.
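As an illustration of the expected behavior, here is a minimal sketch of a Nagios-style check that goes critical when the last checkpoint is too old. It is not the fix that was applied for this issue. It assumes that the modification time of some checkpoint file (for example the fsimage file in the NameNode's name directory) is a usable proxy for the last successful merge. The function names, the path argument, and the thresholds are all hypothetical. The only fixed convention used is the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import os
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_age(age_seconds, warn_seconds, crit_seconds):
    """Map the age of the last checkpoint to a Nagios status code."""
    if age_seconds >= crit_seconds:
        return CRITICAL
    if age_seconds >= warn_seconds:
        return WARNING
    return OK


def check_checkpoint(path, warn_seconds, crit_seconds, now=None):
    """Return (status, message) based on the mtime of a checkpoint file.

    `path` is an assumption: any file that is rewritten on each
    successful merge would do.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError as exc:
        # A missing or unreadable file is reported as UNKNOWN, not OK.
        return UNKNOWN, "UNKNOWN: cannot stat %s (%s)" % (path, exc)
    age = now - mtime
    status = classify_age(age, warn_seconds, crit_seconds)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[status]
    return status, "%s: last checkpoint %d seconds ago" % (label, age)
```

A wrapper script would print the returned message and exit with the returned status so that Nagios can pick it up. The thresholds should be chosen relative to the configured checkpoint interval, so that a single delayed merge produces a warning while several missed merges produce a critical alarm.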
SUPPORT ANALYSIS: N/A
Note:
We need to get this fixed and alert our customers to add the Nagios alarm
ASAP.