HBASE-27966

HBase Master/RS JVM metrics populated incorrectly

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha-4
    • Fix Version/s: 2.6.0, 2.5.6, 3.0.0-beta-1
    • Component/s: metrics
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      HBase Master/RS JVM metrics are populated incorrectly due to a regression, which prevents the Ambari Metrics System from capturing them.

      Based on my analysis, the issue affects all releases after 2.0.0-alpha-4 and appears to be caused by HBASE-18846.

      I compared the JVM metrics across three versions of HBase; the results are attached below:

      HBase: 1.1.2

      {
          "name" : "Hadoop:service=HBase,name=JvmMetrics",
          "modelerType" : "JvmMetrics",
          "tag.Context" : "jvm",
          "tag.ProcessName" : "RegionServer",
          "tag.SessionId" : "",
          "tag.Hostname" : "HOSTNAME",
          "MemNonHeapUsedM" : 196.05664,
          "MemNonHeapCommittedM" : 347.60547,
          "MemNonHeapMaxM" : 4336.0,
          "MemHeapUsedM" : 7207.315,
          "MemHeapCommittedM" : 66080.0,
          "MemHeapMaxM" : 66080.0,
          "MemMaxM" : 66080.0,
          "GcCount" : 3953,
          "GcTimeMillis" : 662520,
          "ThreadsNew" : 0,
          "ThreadsRunnable" : 214,
          "ThreadsBlocked" : 0,
          "ThreadsWaiting" : 626,
          "ThreadsTimedWaiting" : 78,
          "ThreadsTerminated" : 0,
          "LogFatal" : 0,
          "LogError" : 0,
          "LogWarn" : 0,
          "LogInfo" : 0
        },
      

      HBase: 2.0.2

      {
          "name" : "Hadoop:service=HBase,name=JvmMetrics",
          "modelerType" : "JvmMetrics",
          "tag.Context" : "jvm",
          "tag.ProcessName" : "IO",
          "tag.SessionId" : "",
          "tag.Hostname" : "HOSTNAME",
          "MemNonHeapUsedM" : 203.86688,
          "MemNonHeapCommittedM" : 740.6953,
          "MemNonHeapMaxM" : -1.0,
          "MemHeapUsedM" : 14879.477,
          "MemHeapCommittedM" : 31744.0,
          "MemHeapMaxM" : 31744.0,
          "MemMaxM" : 31744.0,
          "GcCount" : 75922,
          "GcTimeMillis" : 5134691,
          "ThreadsNew" : 0,
          "ThreadsRunnable" : 90,
          "ThreadsBlocked" : 3,
          "ThreadsWaiting" : 158,
          "ThreadsTimedWaiting" : 36,
          "ThreadsTerminated" : 0,
          "LogFatal" : 0,
          "LogError" : 0,
          "LogWarn" : 0,
          "LogInfo" : 0
        },
      

      HBase: 2.5.2

      {
          "name": "Hadoop:service=HBase,name=JvmMetrics",
          "modelerType": "JvmMetrics",
          "tag.Context": "jvm",
          "tag.ProcessName": "IO",
          "tag.SessionId": "",
          "tag.Hostname": "HOSTNAME",
          "MemNonHeapUsedM": 192.9798,
          "MemNonHeapCommittedM": 198.4375,
          "MemNonHeapMaxM": -1.0,
          "MemHeapUsedM": 773.23584,
          "MemHeapCommittedM": 1004.0,
          "MemHeapMaxM": 1024.0,
          "MemMaxM": 1024.0,
          "GcCount": 2048,
          "GcTimeMillis": 25440,
          "ThreadsNew": 0,
          "ThreadsRunnable": 22,
          "ThreadsBlocked": 0,
          "ThreadsWaiting": 121,
          "ThreadsTimedWaiting": 49,
          "ThreadsTerminated": 0,
          "LogFatal": 0,
          "LogError": 0,
          "LogWarn": 0,
          "LogInfo": 0
        },
      

      From 2.0.x onwards, the field "tag.ProcessName" is populated as "IO" instead of the expected "RegionServer" or "Master".

      Ambari relies on this process name field to build metric names such as 'jvm.RegionServer.JvmMetrics.GcTimeMillis'. See code.

      From 2.0.x onwards, however, the field is populated as 'IO', so a metric named 'jvm.JvmMetrics.GcTimeMillis' is created instead of the expected 'jvm.RegionServer.JvmMetrics.GcTimeMillis'. As a result, the metric gets mixed up with metrics coming from other processes (RS, Master, Spark executors, etc.) running on the same host.
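
The naming behaviour described above can be illustrated with a minimal Java sketch. This is not Ambari's actual sink code: the class `MetricNameSketch` and the method `buildMetricName` are hypothetical, and the rule that only "RegionServer" and "Master" are inserted into the name is an assumption chosen because it reproduces the two metric names reported in this issue.

```java
import java.util.Arrays;
import java.util.List;

public class MetricNameSketch {
    // Assumption: only these process names are inserted into the metric name.
    private static final List<String> KNOWN_PROCESSES = Arrays.asList("RegionServer", "Master");

    // Hypothetical helper: composes "<context>.[<processName>.]<record>.<metric>".
    static String buildMetricName(String context, String processName,
                                  String recordName, String metricName) {
        StringBuilder sb = new StringBuilder(context).append('.');
        // The process name is added only for the "jvm" context and only when recognised.
        if ("jvm".equals(context) && processName != null
                && KNOWN_PROCESSES.contains(processName)) {
            sb.append(processName).append('.');
        }
        return sb.append(recordName).append('.').append(metricName).toString();
    }

    public static void main(String[] args) {
        // HBase 1.1.2 style tag value: the process name ends up in the metric name.
        System.out.println(buildMetricName("jvm", "RegionServer", "JvmMetrics", "GcTimeMillis"));
        // HBase 2.x style tag value "IO": the process name is dropped.
        System.out.println(buildMetricName("jvm", "IO", "JvmMetrics", "GcTimeMillis"));
    }
}
```

Under these assumptions, the first call yields 'jvm.RegionServer.JvmMetrics.GcTimeMillis' and the second yields 'jvm.JvmMetrics.GcTimeMillis', which is the collision described in this report.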

      Expected
      Field "tag.ProcessName" should be populated as "RegionServer" or "Master" instead of "IO".

      Actual
      Field "tag.ProcessName" is populated as "IO" instead of the expected "RegionServer" or "Master". This causes Ambari to publish incorrectly named metrics, mixing up metrics from different processes and raising various alerts around JVM metrics.
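
The expected/actual distinction can be checked against a JvmMetrics bean like the ones quoted above. The following is a rough sketch only: `ProcessNameCheck` and its methods are hypothetical names, and it uses a regular expression rather than a real JSON parser to stay dependency-free, so it assumes the tag appears as a plain string value.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProcessNameCheck {
    // Matches both `"tag.ProcessName" : "X"` and `"tag.ProcessName": "X"`.
    private static final Pattern PROCESS_NAME =
            Pattern.compile("\"tag\\.ProcessName\"\\s*:\\s*\"([^\"]*)\"");

    // Extracts the tag value from the bean text, if present.
    static Optional<String> processName(String beanJson) {
        Matcher m = PROCESS_NAME.matcher(beanJson);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // True only when the tag carries one of the values this issue expects.
    static boolean isExpected(String beanJson) {
        return processName(beanJson)
                .map(v -> v.equals("RegionServer") || v.equals("Master"))
                .orElse(false);
    }

    public static void main(String[] args) {
        String before = "{ \"tag.Context\" : \"jvm\", \"tag.ProcessName\" : \"RegionServer\" }";
        String after = "{ \"tag.Context\": \"jvm\", \"tag.ProcessName\": \"IO\" }";
        System.out.println(isExpected(before)); // 1.1.2 style bean
        System.out.println(isExpected(after));  // 2.x style bean
    }
}
```

A bean text obtained from the affected versions would fail this check, while one from 1.1.2 would pass.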

      Attachments

        1. test_patch.txt
          4 kB
          Nihal Jain

            People

              Assignee: Nihal Jain (nihaljain.cs)
              Reporter: Nihal Jain (nihaljain.cs)