Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
Description
When a message is received, the PassiveGossipThread updates heartbeats. Currently, a lambda in the GossipManager periodically walks the member list, marks hosts as down (or back up), and fires the event notification listener:
scheduledServiced.scheduleAtFixedRate(() -> {
  try {
    for (Entry<LocalGossipMember, GossipState> entry : members.entrySet()) {
      Double result = null;
      try {
        result = entry.getKey().detect(clock.nanoTime());
        //System.out.println(entry.getKey() +" "+ result);
        if (result != null) {
          if (result > settings.getConvictThreshold() && entry.getValue() == GossipState.UP) {
            members.put(entry.getKey(), GossipState.DOWN);
            listener.gossipEvent(entry.getKey(), GossipState.DOWN);
          }
          if (result <= settings.getConvictThreshold() && entry.getValue() == GossipState.DOWN) {
            members.put(entry.getKey(), GossipState.UP);
            listener.gossipEvent(entry.getKey(), GossipState.UP);
          }
        }
      } catch (IllegalArgumentException ex) {
        //0.0 returns throws exception computing the mean.
        long now = clock.nanoTime();
        long nowInMillis = TimeUnit.MILLISECONDS.convert(now,TimeUnit.NANOSECONDS);
        if (nowInMillis - settings.getCleanupInterval() > entry.getKey().getHeartbeat() && entry.getValue() == GossipState.UP){
          LOGGER.warn("Marking down");
          members.put(entry.getKey(), GossipState.DOWN);
          listener.gossipEvent(entry.getKey(), GossipState.DOWN);
        }
      } //end catch
    } // end for
  } catch (RuntimeException ex) {
    LOGGER.warn("scheduled state had exception", ex);
  }
This should be moved to a named class that is injected with the data members it needs, which would make the logic easier to unit test with mocks. It still needs to run periodically for the rare case in which no messages are arriving, but it could also run right after a message is received rather than waiting for the scheduled executor to trigger it. In many cases that would detect a state change sooner.
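A minimal sketch of what such a named class might look like. All of the names here (MemberStateRefresher, Member, MemberState, StateListener) are hypothetical stand-ins rather than the project's actual types, the clock is assumed to be a plain LongSupplier, and the IllegalArgumentException fallback from the lambda is left out for brevity:

```java
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical stand-ins for GossipState, LocalGossipMember and the listener,
// declared here only so the sketch compiles on its own.
enum MemberState { UP, DOWN }

interface Member {
  // Returns a suspicion value, or null when there is not enough data yet.
  Double detect(long nowNanos);
}

interface StateListener {
  void gossipEvent(Member member, MemberState state);
}

// Named replacement for the lambda. Everything it needs is passed in through
// the constructor, so a test can supply a fake clock and a recording listener.
class MemberStateRefresher implements Runnable {
  private final Map<Member, MemberState> members;
  private final double convictThreshold;
  private final LongSupplier nanoClock;
  private final StateListener listener;

  MemberStateRefresher(Map<Member, MemberState> members, double convictThreshold,
                       LongSupplier nanoClock, StateListener listener) {
    this.members = members;
    this.convictThreshold = convictThreshold;
    this.nanoClock = nanoClock;
    this.listener = listener;
  }

  @Override
  public void run() {
    for (Map.Entry<Member, MemberState> entry : members.entrySet()) {
      Double result = entry.getKey().detect(nanoClock.getAsLong());
      if (result == null) {
        continue; // no verdict yet, leave the state untouched
      }
      MemberState next = result > convictThreshold ? MemberState.DOWN : MemberState.UP;
      if (next != entry.getValue()) {
        // Only a change of state updates the map and notifies the listener.
        members.put(entry.getKey(), next);
        listener.gossipEvent(entry.getKey(), next);
      }
    }
  }
}
```

Because the class is a Runnable, the same instance could be handed to the scheduled executor for the periodic pass and also have run() called directly from the message-handling path, which is the "run after receiving a message" idea described above.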