  Chukwa / CHUKWA-487

Collector left in a bad state after temporary NN outage

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.4.0
    • Fix Version/s: None
    • Component/s: Data Collection
    • Labels:
      None

      Description

      When the name node returns errors to the collector, at some point the collector dies halfway. This behavior should be changed to either resemble the agents and keep retrying, or to shut down completely. Instead, what I'm seeing is that the collector logs that it's shutting down and the var/pidDir/Collector.pid file gets removed, but the collector process continues to run, albeit without handling new data. This log entry is then repeated ad infinitum:

      2010-05-06 17:35:06,375 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
      2010-05-06 17:36:06,379 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
      2010-05-06 17:37:06,384 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0

      Attachments

      1. CHUKWA-487.patch
        2 kB
        Ari Rabkin
      2. CHUKWA-487.threaddump.txt
        257 kB
        Bill Graham

        Activity

        billgraham Bill Graham added a comment -

        Here's what I saw in the logs when I had to restart my NN. It took a little while to exit safe mode. I had to restore from the secondary name node, so there might have been some data loss upon restore.

        2010-05-06 17:32:19,515 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=318716 dataRate=10622
        2010-05-06 17:32:49,518 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=196741 dataRate=6557
        2010-05-06 17:33:06,367 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:129,numberchunks:217
        2010-05-06 17:33:19,521 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
        2010-05-06 17:33:49,523 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
        2010-05-06 17:34:01,142 WARN org.apache.hadoop.dfs.DFSClient$LeaseChecker@36b60b93 DFSClient - Problem renewing lease for DFSClient_-1088933168: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.SafeModeException: Cannot renew lease for DFSClient_-1088933168.
        Name node is in safe mode.
        The ratio of reported blocks 0.0000 has not reached the threshold 0.9990. Safe mode will be turned off automatically.
        at org.apache.hadoop.dfs.FSNamesystem.renewLease(FSNamesystem.java:1823)
        at org.apache.hadoop.dfs.NameNode.renewLease(NameNode.java:458)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:890)
        at org.apache.hadoop.ipc.Client.call(Client.java:716)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at org.apache.hadoop.dfs.$Proxy0.renewLease(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at org.apache.hadoop.dfs.$Proxy0.renewLease(Unknown Source)
        at org.apache.hadoop.dfs.DFSClient$LeaseChecker.run(DFSClient.java:781)
        at java.lang.Thread.run(Thread.java:619)

        2010-05-06 17:34:01,608 WARN Timer-2094 SeqFileWriter - Got an exception in rotate
        org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.SafeModeException: Cannot complete file /chukwa/logs/201006172737418_xxxxxxxxxcom_71ea99261284ab9f0566faa.chukwa. Name node is in safe mode.
        The ratio of reported blocks 0.0000 has not reached the threshold 0.9990. Safe mode will be turned off automatically.
        at org.apache.hadoop.dfs.FSNamesystem.completeFileInternal(FSNamesystem.java:1209)
        at org.apache.hadoop.dfs.FSNamesystem.completeFile(FSNamesystem.java:1200)
        at org.apache.hadoop.dfs.NameNode.complete(NameNode.java:351)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:890)
        at org.apache.hadoop.ipc.Client.call(Client.java:716)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at org.apache.hadoop.dfs.$Proxy0.complete(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at org.apache.hadoop.dfs.$Proxy0.complete(Unknown Source)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:2736)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:2657)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:59)
        at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:79)
        at org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter.rotate(SeqFileWriter.java:194)
        at org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter$1.run(SeqFileWriter.java:235)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
        2010-05-06 17:34:01,647 FATAL Timer-2094 SeqFileWriter - IO Exception in rotate. Exiting!
        2010-05-06 17:34:01,661 FATAL btpool0-6248 SeqFileWriter - IOException when trying to write a chunk, Collector is going to exit!
        java.io.IOException: Stream closed.
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.isClosed(DFSClient.java:2245)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2481)
        at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:155)
        at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
        at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
        at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
        at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:47)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1016)
        at org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter.add(SeqFileWriter.java:281)
        at org.apache.hadoop.chukwa.datacollection.collector.servlet.ServletCollector.accept(ServletCollector.java:152)
        at org.apache.hadoop.chukwa.datacollection.collector.servlet.ServletCollector.doPost(ServletCollector.java:190)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:362)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:729)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:324)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:505)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:843)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:647)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:211)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:380)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:395)
        at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:450)
        2010-05-06 17:34:06,370 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:28,numberchunks:0
        2010-05-06 17:35:06,375 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
        2010-05-06 17:36:06,379 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
        2010-05-06 17:37:06,384 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
        ...

        asrabkin Ari Rabkin added a comment -

        So there are basically two possible fixes. We can System.exit() on that error and hope the daemon respawns, or try to handle it in Chukwa. I prefer the former approach. Comments or objections?

        asrabkin Ari Rabkin added a comment -

        Ugh. We actually DO have code to do exactly that. It's even in 0.4. It calls System.exit() and all.

        Is some shutdown hook gumming up the works?

        asrabkin Ari Rabkin added a comment -

        Also, this is a big enough reliability problem that it should block any 0.5 release.

        billgraham Bill Graham added a comment -

        I think the latter approach (trying to handle it in Chukwa) would be preferable. The question then is what to do with incoming data when HDFS is down. Could we go into a mode where we stop receiving new data from the agents while we retry HDFS every N seconds, for up to M seconds, before exiting? We'd also need to think through what to do if the current file being written to disappears due to an SNN restore with data loss.
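
        As a rough illustration of the retry-before-exit idea above, a minimal sketch might look like the following. The class and method names are hypothetical (nothing here is existing Chukwa code), and retryIntervalMs/maxWaitMs stand in for the N/M placeholders:

        import java.util.concurrent.TimeUnit;

        /** Hypothetical sketch only; not part of the Chukwa codebase. */
        public class HdfsRetryGate {
          private volatile boolean acceptingData = true;   // would gate the collector servlet's accept path

          /** Pause intake and probe HDFS every retryIntervalMs, for at most maxWaitMs, before giving up. */
          public boolean waitForHdfs(long retryIntervalMs, long maxWaitMs) throws InterruptedException {
            acceptingData = false;                         // stop taking new chunks from agents
            long deadline = System.currentTimeMillis() + maxWaitMs;
            while (System.currentTimeMillis() < deadline) {
              if (probeHdfs()) {                           // a real check might call fs.exists() on the sink dir
                acceptingData = true;                      // HDFS is back; resume normal operation
                return true;
              }
              TimeUnit.MILLISECONDS.sleep(retryIntervalMs);
            }
            return false;                                  // caller would then shut the collector down cleanly
          }

          private boolean probeHdfs() {
            return false;                                  // placeholder; the actual HDFS health check goes here
          }
        }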

        asrabkin Ari Rabkin added a comment -

        Agents will retry if a collector goes down, without losing data, so it should be totally safe to let it crash and reboot. And I think restarting from an entirely-clean state is preferable to fixing it. This also sidesteps questions of file handles breaking.

        How common are SNN restores with data loss? I had assumed that after a file was closed, its data was as durable as NN metadata on disk.

        billgraham Bill Graham added a comment -

        Actually, looking closer, I can't say for sure that I had data loss. It could just be that the bounce of the NN made the file unavailable. It appears that in my case the file couldn't be closed or rotated because the NN had gone down.

        The only way you could have data loss, AFAIK, would be if the current data dir, including the edit log, got corrupted since the last SNN checkpoint. I think that's rare enough not to worry about. My concern was more about how to make sure the collector isn't left in a bad state if part of an un-closed file was lost.

        The crash-and-reboot scenario is better than what we have now, but a self-recovering solution would be ideal. That way, if the NN crashed unexpectedly (perhaps during off-business hours), the collectors wouldn't all need to be restarted. Again though, this is probably a rare occurrence.

        billgraham Bill Graham added a comment -

        Just happened again, here's a thread dump. Looks like there's some lock contention on the sequence file while trying to shut down.

        jboulon Jerome Boulon added a comment -

        Hi,
        I looked quickly at the code, and this is something I changed in Honu because of a possible "virtual dead-lock".
        The main thread (writer) tries to acquire the lock to write to the sequence file while the shutdownHook also tries to get it.
        The issue is that the writer holds the lock for about 2 minutes (NN retries), and the chances are that, statistically, the next one to get the lock will also be a queue add instead of the close.
        One quick workaround for now would be to wait no longer than xxx sec on the close, and also to configure your NN client to fail fast instead of retrying again and again in the specific case where the NN is not available.

        A longer-term fix could be to back off in the SeqFileWriter.add method.
        In Honu, I've put timeouts around all locks and addQueue to make sure that nobody gets blocked on a lock, plus I have a TRY_LATER error that is returned if the add takes more than 2 seconds, and a READY/!READY state to accept/reject incoming requests.
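
        To make the pattern concrete, here is a minimal, self-contained sketch of the bounded-wait idea described above (a timeout around the lock, a TRY_LATER result, and a READY/!READY gate). It is an illustration only, not Honu or Chukwa code, and all names are assumed:

        import java.util.concurrent.TimeUnit;
        import java.util.concurrent.locks.ReentrantLock;

        /** Illustration of bounded lock waits on the add path; not actual Honu/Chukwa code. */
        public class BoundedWriter {
          enum Result { OK, TRY_LATER }

          private final ReentrantLock lock = new ReentrantLock();
          private volatile boolean ready = true;           // READY / !READY gate for incoming requests

          /** Flipped by whatever monitors writer health (e.g. after an HDFS failure or recovery). */
          public void setReady(boolean isReady) {
            ready = isReady;
          }

          /** Try to append a chunk, but never block the caller for more than ~2 seconds. */
          public Result add(byte[] chunk) throws InterruptedException {
            if (!ready) {
              return Result.TRY_LATER;                     // reject up front while the writer is unhealthy
            }
            if (!lock.tryLock(2, TimeUnit.SECONDS)) {
              return Result.TRY_LATER;                     // lock is held, e.g. by a writer stuck in NN retries
            }
            try {
              // the actual SequenceFile append would go here
              return Result.OK;
            } finally {
              lock.unlock();
            }
          }
        }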

        asrabkin Ari Rabkin added a comment -

        Jerome – thanks for the diagnosis!

        I should be able to knock out a patch tomorrow (Monday).

        asrabkin Ari Rabkin added a comment -

        How about this? In close(), only try to acquire the lock for 500 ms, else abort.
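
        Roughly, that idea could look like the sketch below. This is an illustration with assumed names, not the attached CHUKWA-487.patch itself, and the real SeqFileWriter may use a different locking primitive than ReentrantLock:

        import java.util.concurrent.TimeUnit;
        import java.util.concurrent.locks.ReentrantLock;

        /** Sketch of a close() that gives up on the writer lock after 500 ms; not the actual patch. */
        public class LockBoundedClose {
          private final ReentrantLock lock = new ReentrantLock();

          public void close() {
            boolean locked = false;
            try {
              // Wait at most 500 ms; if a writer is stuck in NN retries, abort instead of deadlocking.
              locked = lock.tryLock(500, TimeUnit.MILLISECONDS);
              if (locked) {
                // closing the underlying SequenceFile would go here
              }
            } catch (InterruptedException e) {
              Thread.currentThread().interrupt();          // preserve the interrupt and fall through to abort
            } finally {
              if (locked) {
                lock.unlock();
              }
            }
          }
        }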

        billgraham Bill Graham added a comment -

        This approach makes sense to me w.r.t. the deadlock. I've deployed it and will see how it works.

        Ari, regarding your comment above about 'hoping the daemon respawns', how do we expect the daemon to do that? Is there logic in the chukwa start script to check for that case?

        asrabkin Ari Rabkin added a comment -

        The respawn logic is no longer in Chukwa. However, my impression was that most sites have their own daemon-watching approach in the background to respawn crashed daemons. For instance, Berkeley uses runit.

        billgraham Bill Graham added a comment -

        How difficult would it be to make this process smarter about recovering on its own? Ideally it would recover from a temporary node outage more gracefully than just shutting down.

        asrabkin Ari Rabkin added a comment -

        My sense of this is that failure-recovery logic is hard to write and hard to test. The bug that prompted this JIRA was a failure-handler bug. It's also not clear to me what the benefit is. Chukwa has mechanisms for agent-side retry, via CHUKWA-369.

        You can turn agent-side recovery on using the httpConnector.asyncAcks option.

        This should all be documented at some point.
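
        For reference, a minimal way to flip that option programmatically might look like the following. It assumes the property is boolean-valued and would normally live in the agent's configuration XML rather than in code, so treat it strictly as a sketch:

        import org.apache.hadoop.conf.Configuration;

        /** Illustration only; the property name comes from the comment above, the boolean type is an assumption. */
        public class AsyncAcksExample {
          public static void main(String[] args) {
            Configuration conf = new Configuration();
            conf.setBoolean("httpConnector.asyncAcks", true);
            System.out.println("httpConnector.asyncAcks = " + conf.get("httpConnector.asyncAcks"));
          }
        }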

        asrabkin Ari Rabkin added a comment -

        Bill, did this patch work? If there were no problems with it in production, I'll commit it.

        billgraham Bill Graham added a comment -

        I deployed this patch into production not long after you submitted it, but we haven't had any NN failures since then so I can't tell if it's effective yet. I'm on vacation this week, but can try to force a failure next week to see how the collector handles it.

        asrabkin Ari Rabkin added a comment -

        That might be excessive. I mostly wanted to be reassured that the patch didn't obstruct normal operation.

        asrabkin Ari Rabkin added a comment -

        I just committed this.


          People

          • Assignee:
            Unassigned
            Reporter:
            billgraham Bill Graham
          • Votes:
            0
            Watchers:
            0
