Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
When a user runs the diagnostics task (bin/nifi.sh diagnostics <filename>), the output contains many details, including cluster information. But the cluster information is fairly minimal. We should gather and provide more information here, especially around node disconnections.
For example, how many times has each node disconnected/reconnected to the cluster?
What was the reason for each of the disconnection/reconnections?
What were the timestamps so that those events can be correlated with other events that may have been occurring during that time?
How many times has the node lost connection to ZooKeeper since startup? When was the last time?
What's the min/max/average time taken to heartbeat over the last hour?
Any other diagnostics information that could be relevant for diagnosing cluster disconnection issues should be included as well.
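To make the request more concrete, here is a rough, language-neutral sketch (written in Python for brevity) of the kind of bookkeeping that could back these questions. Everything in it is hypothetical: `ConnectionEvent`, `ClusterDiagnostics`, the state strings, and the one-hour default window are illustrative names and values, not existing NiFi classes or settings.

```python
from collections import deque
from dataclasses import dataclass, field
from statistics import mean
from typing import Deque, Dict, List, Optional, Tuple


@dataclass
class ConnectionEvent:
    """One connect/disconnect transition for a node (hypothetical model)."""
    timestamp: float  # seconds since epoch, so events can be correlated with logs
    node_id: str
    state: str        # assumed values: "DISCONNECTED" or "CONNECTED"
    reason: str       # free-text explanation of why the transition happened


@dataclass
class ClusterDiagnostics:
    """Keeps an event history plus heartbeat durations for a sliding window."""
    window_seconds: float = 3600.0  # assumed "last hour" window
    events: List[ConnectionEvent] = field(default_factory=list)
    # Each entry is (timestamp, duration_ms), appended in time order.
    heartbeats: Deque[Tuple[float, float]] = field(default_factory=deque)

    def record_event(self, event: ConnectionEvent) -> None:
        self.events.append(event)

    def record_heartbeat(self, timestamp: float, duration_ms: float) -> None:
        self.heartbeats.append((timestamp, duration_ms))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop heartbeat samples that fall outside the sliding window.
        cutoff = now - self.window_seconds
        while self.heartbeats and self.heartbeats[0][0] < cutoff:
            self.heartbeats.popleft()

    def disconnect_counts(self) -> Dict[str, int]:
        # Number of disconnections per node since tracking began.
        counts: Dict[str, int] = {}
        for event in self.events:
            if event.state == "DISCONNECTED":
                counts[event.node_id] = counts.get(event.node_id, 0) + 1
        return counts

    def heartbeat_stats(self, now: float) -> Optional[Tuple[float, float, float]]:
        # Returns (min, max, average) in milliseconds, or None if no samples remain.
        self._evict(now)
        durations = [duration for _, duration in self.heartbeats]
        if not durations:
            return None
        return min(durations), max(durations), mean(durations)
```

A diagnostics dump could then print the per-node counts from `disconnect_counts()`, the full event list with timestamps and reasons, and the tuple from `heartbeat_stats()`. ZooKeeper connection losses could presumably be recorded the same way as connection events, with their own state value.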