Description
Today, JMX metrics do not distinguish a primary region from a replica. A possible solution is to add the replica ID to the JMX metric names. The benefits include, for example:
- Read latency recorded on a replica region indicates that the first attempt against the primary region timed out.
- Write requests on replicas come from the replication process, while those on the primary come from clients.
- When looking for read hot spots, replicas should be excluded, since TIMELINE reads are sent to all replicas.
To implement this, we can change the format of the metric names found at
Hadoop->HBase->RegionServer->Regions->Attributes
from
namespace_<namespace>_table_<tablename>_region_<regionname>_metric_<metricname>
to
namespace_<namespace>_table_<tablename>_region_<regionname>_replicaid_<replicaid>_metric_<metricname>
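As a rough illustration of how a monitoring script could consume the proposed names, the sketch below parses both the current and the proposed format. The helper names (`parse_region_metric`, `is_primary`) are hypothetical and not part of HBase. It assumes that the primary region has replica ID 0 and that the name components do not themselves contain the literal separators `_table_`, `_region_`, `_replicaid_` or `_metric_`.

```python
import re
from typing import NamedTuple, Optional

# Pattern for both name formats; the "_replicaid_<replicaid>" part is optional
# so that names in the current (pre-change) format still parse.
# Assumption: components do not contain the separator strings themselves.
_NAME = re.compile(
    r"namespace_(?P<namespace>.+?)"
    r"_table_(?P<table>.+?)"
    r"_region_(?P<region>.+?)"
    r"(?:_replicaid_(?P<replicaid>\d+))?"
    r"_metric_(?P<metric>.+)"
)


class RegionMetric(NamedTuple):
    namespace: str
    table: str
    region: str
    replica_id: Optional[int]  # None when the name uses the current format
    metric: str


def parse_region_metric(name: str) -> Optional[RegionMetric]:
    """Split a region metric name into its parts; return None if it does not match."""
    m = _NAME.fullmatch(name)
    if m is None:
        return None
    rid = m.group("replicaid")
    return RegionMetric(
        m.group("namespace"),
        m.group("table"),
        m.group("region"),
        int(rid) if rid is not None else None,
        m.group("metric"),
    )


def is_primary(metric: RegionMetric) -> bool:
    """Assumption: replica ID 0 denotes the primary region."""
    return metric.replica_id == 0
```

With names in the proposed format, a hot-spot check could keep only the entries for which `is_primary` returns true, so that TIMELINE reads served by replicas are not counted.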