Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- None
- None
- Reviewed
Description
Replication exports the ageOfLastShippedOp metric as an indication of how far replication is lagging. With multiwal enabled, however, this metric is not representative: replication can lag for a long time for one WAL group (for example, when something is wrong with a particular region) while being fine for the others. In that case ageOfLastShippedOp is useless for alerting.
Also, since there is no mapping between individual replication sources and replication sinks, the age of the last applied op can be a very spiky metric when only certain replication sources are lagging.
We should use histograms for these metrics and report the maximum value of each histogram as the replication lag when building stats.
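To illustrate why the histogram maximum is the more useful value for alerting, here is a minimal sketch. It is not the HBase metrics code: `AgeHistogram`, `ReplicationLagDemo`, and the sample ages are hypothetical names and numbers chosen only for this example, and the "histogram" tracks just count, sum, and max rather than full buckets.

```java
/**
 * Hypothetical, simplified stand-in for a metrics histogram.
 * It records ages (in milliseconds) and exposes their mean and maximum.
 */
class AgeHistogram {
    private long count;
    private long sum;
    private long max;

    /** Record one observed age of a shipped (or applied) op. */
    synchronized void update(long ageMs) {
        count++;
        sum += ageMs;
        if (ageMs > max) {
            max = ageMs;
        }
    }

    /** Largest age recorded so far; 0 if nothing was recorded. */
    synchronized long getMax() {
        return max;
    }

    /** Arithmetic mean of recorded ages; 0.0 if nothing was recorded. */
    synchronized double getMean() {
        return count == 0 ? 0.0 : (double) sum / count;
    }
}

class ReplicationLagDemo {
    public static void main(String[] args) {
        AgeHistogram ageOfLastShippedOp = new AgeHistogram();

        // Assumed sample values: two WAL groups are healthy,
        // one has been stuck for ten minutes.
        ageOfLastShippedOp.update(50L);       // WAL group 1
        ageOfLastShippedOp.update(40L);       // WAL group 2
        ageOfLastShippedOp.update(600_000L);  // WAL group 3 (lagging)

        // The maximum exposes the lagging group directly, whereas a single
        // "last written" gauge would show whichever group reported last.
        System.out.println("max age (ms):  " + ageOfLastShippedOp.getMax());
        System.out.println("mean age (ms): " + ageOfLastShippedOp.getMean());
    }
}
```

With a single gauge, the healthy groups can overwrite the value reported by the lagging one; taking the maximum over all recorded ages keeps the worst-case lag visible until it clears.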
Attachments
Issue Links
- relates to
- HBASE-17579 Backport HBASE-16302 to 1.3.1
- Closed