Ambari / AMBARI-20309

HBase Master CPU Utilization Alert is in unknown state due to kinit error

Details

    Description

      HBase Master CPU Utilization Alert is in unknown state due to kinit error:

      Execution of '/usr/bin/kinit -c /var/lib/ambari-agent/tmp/curl_krb_cache/metric_alert_ambari-qa_cc_56787c2122a8214ca9775f3433361f8b -kt HTTP/_HOST@EXAMPLE.COM /etc/security/keytabs/spnego.service.keytab > /dev/null' returned 1. kinit: Client not found in Kerberos database while getting initial credentials
      

      The KDC log (/var/log/krb5kdc.log) shows the same failure, with the keytab path submitted as the client principal:

      Mar 03 16:43:06 c6401.ambari.apache.org krb5kdc[4749](info): AS_REQ (4 etypes {18 17 16 23}) 192.168.64.101: CLIENT_NOT_FOUND: /etc/security/keytabs/spnego.service.keytab@EXAMPLE.COM for krbtgt/EXAMPLE.COM@EXAMPLE.COM, Client not found in Kerberos database
      

      Cause
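      The HBASE alerts.json file (common-services/HBASE/0.96.0.2.0/alerts.json) has the values of the kerberos_keytab and kerberos_principal properties swapped. As a result, kinit receives the principal where it expects the keytab file and the keytab path where it expects the principal, which is why the KDC looks up /etc/security/keytabs/spnego.service.keytab@EXAMPLE.COM as a client.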

            {
              "name": "hbase_master_cpu",
              "label": "HBase Master CPU Utilization",
              "description": "This host-level alert is triggered if CPU utilization of the HBase Master exceeds certain warning and critical thresholds. It checks the HBase Master JMX Servlet for the SystemCPULoad property. The threshold values are in percent.",
              "interval": 5,
              "scope": "ANY",
              "enabled": true,
              "source": {
                "type": "METRIC",
                "uri": {
                  "http": "{{hbase-site/hbase.master.info.port}}",
                  "default_port": 60010,
                  "connection_timeout": 5.0,
                  "kerberos_keytab": "{{hbase-site/hbase.security.authentication.spnego.kerberos.principal}}",
                  "kerberos_principal": "{{hbase-site/hbase.security.authentication.spnego.kerberos.keytab}}"
                },
                "reporting": {
                  "ok": {
                    "text": "{1} CPU, load {0:.1%}"
                  },
                  "warning": {
                    "text": "{1} CPU, load {0:.1%}",
                    "value": 200
                  },
                  "critical": {
                    "text": "{1} CPU, load {0:.1%}",
                    "value": 250
                  },
                  "units" : "%",
                  "type": "PERCENT"
                },
                "jmx": {
                  "property_list": [
                    "java.lang:type=OperatingSystem/SystemCpuLoad",
                    "java.lang:type=OperatingSystem/AvailableProcessors"
                  ],
                  "value": "{0} * 100"
                }
              }
            }
      

      Note the swapped values:

                  "kerberos_keytab": "{{hbase-site/hbase.security.authentication.spnego.kerberos.principal}}",
                  "kerberos_principal": "{{hbase-site/hbase.security.authentication.spnego.kerberos.keytab}}"
      
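To make the effect of the swap concrete, here is a minimal sketch that assembles a kinit command in the same argument order as the failing one (`-kt <keytab> <principal>`). The helper `build_kinit_command` and the cache path `/tmp/cc` are hypothetical names used only for illustration; this is not Ambari's actual alert code. The principal and keytab values are taken from the error message in this issue.

```python
def build_kinit_command(ccache_path, keytab, principal, kinit_path="/usr/bin/kinit"):
    """Assemble a kinit call; kinit expects '-kt <keytab file> <principal>'."""
    return "{0} -c {1} -kt {2} {3} > /dev/null".format(
        kinit_path, ccache_path, keytab, principal
    )


# Resolved hbase-site values, as they appear in the error message above.
spnego_principal = "HTTP/_HOST@EXAMPLE.COM"
spnego_keytab = "/etc/security/keytabs/spnego.service.keytab"

# With the swapped alert definition, each URI property holds the other's value.
swapped_uri = {
    "kerberos_keytab": spnego_principal,
    "kerberos_principal": spnego_keytab,
}
broken = build_kinit_command(
    "/tmp/cc", swapped_uri["kerberos_keytab"], swapped_uri["kerberos_principal"]
)

# With the properties assigned correctly, the keytab file comes first.
fixed_uri = {
    "kerberos_keytab": spnego_keytab,
    "kerberos_principal": spnego_principal,
}
fixed = build_kinit_command(
    "/tmp/cc", fixed_uri["kerberos_keytab"], fixed_uri["kerberos_principal"]
)

print(broken)
print(fixed)
```

In the swapped case the keytab path lands in the principal position, which matches the client name the KDC reports as not found.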

      Solution
      Swap the values of the kerberos_keytab and kerberos_principal properties in common-services/HBASE/0.96.0.2.0/alerts.json so that each references its matching hbase-site setting:

                  "kerberos_principal": "{{hbase-site/hbase.security.authentication.spnego.kerberos.principal}}",
                  "kerberos_keytab": "{{hbase-site/hbase.security.authentication.spnego.kerberos.keytab}}"
      
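The same kind of mistake could exist in other alert definitions, so a quick scan may be useful. The following is a rough sketch, not part of the attached patches: it walks any parsed JSON document and reports objects whose kerberos_keytab value references a setting ending in "principal", or whose kerberos_principal value references one ending in "keytab". The function name `find_swapped_kerberos_uris` and the suffix heuristic are assumptions made for this example.

```python
import json


def find_swapped_kerberos_uris(node, path="$"):
    """Recursively yield JSON paths of objects whose Kerberos values look swapped."""
    if isinstance(node, dict):
        keytab = node.get("kerberos_keytab")
        principal = node.get("kerberos_principal")
        if isinstance(keytab, str) and isinstance(principal, str):
            # Heuristic: the templates end in '...keytab}}' or '...principal}}'.
            if (keytab.rstrip("}").endswith("principal")
                    or principal.rstrip("}").endswith("keytab")):
                yield path
        for key, value in node.items():
            yield from find_swapped_kerberos_uris(value, "{0}.{1}".format(path, key))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_swapped_kerberos_uris(item, "{0}[{1}]".format(path, index))


# Trimmed-down copy of the broken definition shown in this issue.
broken_definition = json.loads("""
{
  "name": "hbase_master_cpu",
  "source": {
    "uri": {
      "kerberos_keytab": "{{hbase-site/hbase.security.authentication.spnego.kerberos.principal}}",
      "kerberos_principal": "{{hbase-site/hbase.security.authentication.spnego.kerberos.keytab}}"
    }
  }
}
""")

swapped_paths = list(find_swapped_kerberos_uris(broken_definition))
print(swapped_paths)
```

To check a real file, load it with `json.load` and pass the result to the same function; an empty result means the heuristic found nothing suspicious.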

      Attachments

        1. AMBARI-20309_branch-2.5_01.patch
          85 kB
          Robert Levas
        2. AMBARI-20309_branch-2.5_02.patch
          87 kB
          Robert Levas
        3. AMBARI-20309_trunk_01.patch
          89 kB
          Robert Levas
        4. AMBARI-20309_trunk_02.patch
          90 kB
          Robert Levas

            People

              rlevas Robert Levas