HBase
  1. HBase
  2. HBASE-2146

RPC related metrics are missing in 0.20.3 since recent changes

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.3
    • Fix Version/s: 0.20.3, 0.90.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Since the recent change to the Metrics setup it seems that the RPC related stats have been completely dropped. See attached files for details.

      1. jconsole-hbase-rpc.png
        147 kB
        Lars George
      2. jmx-diffs.txt
        33 kB
        Lars George
      3. HBASE-2146_0.20.patch
        3 kB
        Gary Helmling
      4. HBASE-2146_trunk.patch
        2 kB
        Gary Helmling

        Activity

        Hide
        stack added a comment -

        Applied branch. Resolving. Thanks for the patch Gary.

        Show
        stack added a comment - Applied branch. Resolving. Thanks for the patch Gary.
        Hide
        Lars George added a comment -

        Works great, will commit to trunk and hold of with branch until after RC2 has passed.

        Show
        Lars George added a comment - Works great, will commit to trunk and hold of with branch until after RC2 has passed.
        Hide
        Lars George added a comment -

        Assigning to Gary who did the work.

        Show
        Lars George added a comment - Assigning to Gary who did the work.
        Hide
        Lars George added a comment -

        Will test first thing tomorrow, I'll let you know. Thanks!

        Show
        Lars George added a comment - Will test first thing tomorrow, I'll let you know. Thanks!
        Hide
        Gary Helmling added a comment -

        Same patch as HBASE-2146_0.20.patch against trunk.

        Show
        Gary Helmling added a comment - Same patch as HBASE-2146 _0.20.patch against trunk.
        Hide
        Gary Helmling added a comment -

        Lars, can you try this patch on your setup?

        I did a quick test with jconsole and all the rpc-method metrics seemed to show up on startup for me.

        This patch also drops the duplicate incrementing of rpcQueueTime and rpcProcessingTime.

        Show
        Gary Helmling added a comment - Lars, can you try this patch on your setup? I did a quick test with jconsole and all the rpc-method metrics seemed to show up on startup for me. This patch also drops the duplicate incrementing of rpcQueueTime and rpcProcessingTime.
        Hide
        Gary Helmling added a comment -

        Patch against 0.20 branch to pre-register per-RPC-method metrics, so that the full attribute list is initially available for JMX MBean creation.

        This adds a default access HBaseRPC.getMappedMethodNames() static method to retrieve the values of the HBaseRPC.Invocation.CODE_TO_METHODNAMES map, which isn't super elegant. But it seemed like the least intrusive way to get what we need.

        Show
        Gary Helmling added a comment - Patch against 0.20 branch to pre-register per-RPC-method metrics, so that the full attribute list is initially available for JMX MBean creation. This adds a default access HBaseRPC.getMappedMethodNames() static method to retrieve the values of the HBaseRPC.Invocation.CODE_TO_METHODNAMES map, which isn't super elegant. But it seemed like the least intrusive way to get what we need.
        Hide
        Gary Helmling added a comment -

        Not having all the attributes initially available does seem a bit problematic if you have automated monitoring setup. You're likely to get an AttributeNotFoundException if you have a process trying to retrieve it before it's there.

        I'll take a shot at pre-registering those metrics. I think we can do it with the code <-> method name map in HBaseRPC. I'll post a patch for that (and remove the duplication) shortly.

        Show
        Gary Helmling added a comment - Not having all the attributes initially available does seem a bit problematic if you have automated monitoring setup. You're likely to get an AttributeNotFoundException if you have a process trying to retrieve it before it's there. I'll take a shot at pre-registering those metrics. I think we can do it with the code <-> method name map in HBaseRPC. I'll post a patch for that (and remove the duplication) shortly.
        Hide
        Lars George added a comment -

        And Gary is right.

        There is a duplicate call in the debug!

                if (LOG.isDebugEnabled()) {
                  LOG.debug("Served: " + call.getMethodName() +
                    " queueTime= " + qTime +
                    " procesingTime= " + processingTime);
                  rpcMetrics.rpcQueueTime.inc(qTime);
                  rpcMetrics.rpcProcessingTime.inc(processingTime);
                }
                rpcMetrics.rpcQueueTime.inc(qTime);
                rpcMetrics.rpcProcessingTime.inc(processingTime);
                rpcMetrics.inc(call.getMethodName(), processingTime);
        

        Also, the last line is what I could not see (forrest for the trees kind of thing).

        So we make this issue to patch the above duplicate inc and close it?

        Show
        Lars George added a comment - And Gary is right. There is a duplicate call in the debug! if (LOG.isDebugEnabled()) { LOG.debug( "Served: " + call.getMethodName() + " queueTime= " + qTime + " procesingTime= " + processingTime); rpcMetrics.rpcQueueTime.inc(qTime); rpcMetrics.rpcProcessingTime.inc(processingTime); } rpcMetrics.rpcQueueTime.inc(qTime); rpcMetrics.rpcProcessingTime.inc(processingTime); rpcMetrics.inc(call.getMethodName(), processingTime); Also, the last line is what I could not see (forrest for the trees kind of thing). So we make this issue to patch the above duplicate inc and close it?
        Hide
        Lars George added a comment -

        Ah, thanks Gary, that makes sense. I guess the issue is that my second cluster is not busy and has not yet created the counters. This makes for a difficult discovery though then. Sure, you could run the discovery later, but it would be nice to get an authoritative list from somewhere.

        Show
        Lars George added a comment - Ah, thanks Gary, that makes sense. I guess the issue is that my second cluster is not busy and has not yet created the counters. This makes for a difficult discovery though then. Sure, you could run the discovery later, but it would be nice to get an authoritative list from somewhere.
        Hide
        Gary Helmling added a comment -

        Lars,

        It looks like these extra metrics are dynamically created and added to the registry via the HBaseRPCMetrics.inc() and subsequently create() and get() methods. The org.apache.hadoop.metrics.util.MetricsDynamicMBeanBase class will update the MBean info if the underlying registry changes (see updateMbeanInfoIfMetricsListChanged()), but maybe this isn't being invoked everywhere it's needed?

        The .inc() method is being called in HBaseRPC.Server.call() with the call method name. So it looks like that's how they're getting populated. – As an aside it looks like rpcQueueTIme and rpcProcessingTime are incremented twice there is debug logging is enabled – I'll try to test this out as well a bit later.

        Show
        Gary Helmling added a comment - Lars, It looks like these extra metrics are dynamically created and added to the registry via the HBaseRPCMetrics.inc() and subsequently create() and get() methods. The org.apache.hadoop.metrics.util.MetricsDynamicMBeanBase class will update the MBean info if the underlying registry changes (see updateMbeanInfoIfMetricsListChanged()), but maybe this isn't being invoked everywhere it's needed? The .inc() method is being called in HBaseRPC.Server.call() with the call method name. So it looks like that's how they're getting populated. – As an aside it looks like rpcQueueTIme and rpcProcessingTime are incremented twice there is debug logging is enabled – I'll try to test this out as well a bit later.
        Hide
        Lars George added a comment -

        Hmm, I checked both cluster then they have the exact same hadoop-metrics, hbase-env and so on. All diff"ed by my a few times. So why is that cluster different. What is even weirder is that these are MetricsTimeVaryingRate instances so they must be declared and put into the MetricsRegistry before createMBean() is called. And I cannot find where that is done!

        Show
        Lars George added a comment - Hmm, I checked both cluster then they have the exact same hadoop-metrics, hbase-env and so on. All diff"ed by my a few times. So why is that cluster different. What is even weirder is that these are MetricsTimeVaryingRate instances so they must be declared and put into the MetricsRegistry before createMBean() is called. And I cannot find where that is done!
        Hide
        stack added a comment -

        @LarsG For JMX do they have to be enabled in the hadoop-metrics.properties file (Maybe not?).

        Show
        stack added a comment - @LarsG For JMX do they have to be enabled in the hadoop-metrics.properties file (Maybe not?).
        Hide
        Lars George added a comment -

        Err, this is confusing. I cannot even find where these Metrics are exported in 0.20.2! Am I missing something? Do I have a special build that had those values? Why were they there and why were they removed? Bueller?

        Show
        Lars George added a comment - Err, this is confusing. I cannot even find where these Metrics are exported in 0.20.2! Am I missing something? Do I have a special build that had those values? Why were they there and why were they removed? Bueller?

          People

          • Assignee:
            Gary Helmling
            Reporter:
            Lars George
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development