Ambari / AMBARI-15533

HDFS Alerts for AMS Throw 'invalid literal for int() with base 10: '50.0''

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.2.2
    • Fix Version/s: 2.2.2
    • Component/s: ambari-agent
    • Labels: None

    Description

      SCRIPT alerts are stuck in UNKNOWN status with the response message 'invalid literal for int() with base 10: '50.0''.

      The error is thrown only after a PUT call on the alert definition that updates some of its parameters, because the numeric values are then stored as strings (for example, 50.0 becomes "50.0").

      The scripts need to safely cast their parameters; the fix is in the script here.
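
The failure and one possible form of a safe cast can be sketched as follows. This is an illustration only, not the contents of AMBARI-15533.patch; the helper name `to_int` and its `default` argument are assumptions made for the example.

```python
# Hypothetical helper, not taken from the actual patch.
def to_int(value, default=None):
    """Return value as an int, accepting numbers or numeric strings like '50.0'."""
    try:
        # float() accepts both 50.0 and '50.0'; int() then truncates toward zero.
        return int(float(value))
    except (TypeError, ValueError, OverflowError):
        # None, non-numeric strings, NaN and infinity fall back to the default.
        return default


# Reproduce the reported error: int() rejects a string holding a decimal value.
try:
    int("50.0")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

critical_threshold = to_int("50.0")
```

Whether a threshold should be truncated to an int or kept as a float depends on the script; the sketch only shows that casting through `float()` tolerates both the numeric and the string form.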

      Example PUT request:

      PUT http://172.22.114.20:8080/api/v1/clusters/cl1/alert_definitions/38
      {
        "AlertDefinition" : {
          "cluster_name" : "cl1",
          "component_name" : "NAMENODE",
          "description" : "This service-level alert is triggered if the NN heap usage deviation has grown beyond the specified threshold within a given time interval.",
          "enabled" : true,
          "id" : 38,
          "ignore_host" : false,
          "interval" : 2,
          "label" : "NameNode Heap Usage (Daily)",
          "name" : "increase_nn_heap_usage_daily",
          "scope" : "ANY",
          "service_name" : "HDFS",
          "source" : {
            "parameters" : [
              {
                "name" : "mergeHaMetrics",
                "display_name" : "Whether active and stanby NameNodes metrics should be merged",
                "value" : "false",
                "description" : "Whether active and stanby NameNodes metrics should be merged.",
                "type" : "STRING"
              },
              {
                "name" : "interval",
                "display_name" : "Time interval in minutes",
                "value" : 1441.0,
                "description" : "Time interval in minutes.",
                "type" : "NUMERIC"
              },
              {
                "name" : "appId",
                "display_name" : "AMS application id",
                "value" : "NAMENODE",
                "description" : "The application id used to retrieve the metric.",
                "type" : "STRING"
              },
              {
                "name" : "metricName",
                "display_name" : "Metric Name",
                "value" : "jvm.JvmMetrics.MemHeapUsedM",
                "description" : "The metric to monitor.",
                "type" : "STRING"
              },
              {
                "name" : "metric.deviation.warning.threshold",
                "display_name" : "The standard deviation threshold above which a warning is produced.",
                "units" : "%",
                "value" : 20.0,
                "type" : "PERCENT",
                "threshold" : "WARNING"
              },
              {
                "name" : "metric.deviation.critical.threshold",
                "display_name" : "The standard deviation threshold above which a critical alert is produced.",
                "units" : "%",
                "value" : 50.0,
                "type" : "PERCENT",
                "threshold" : "CRITICAL"
              }
            ],
            "path" : "HDFS/2.1.0.2.0/package/alerts/alert_metrics_deviation.py",
            "type" : "SCRIPT"
          }
        }
      }
      
      

      Response: 200 OK

      A subsequent GET returns the NUMERIC and PERCENT values as strings:

      GET http://172.22.114.20:8080/api/v1/clusters/cl1/alert_definitions/38
      {
        "href" : "http://172.22.114.20:8080/api/v1/clusters/cl1/alert_definitions/38",
        "AlertDefinition" : {
          "cluster_name" : "cl1",
          "component_name" : "NAMENODE",
          "description" : "This service-level alert is triggered if the NN heap usage deviation has grown beyond the specified threshold within a given time interval.",
          "enabled" : true,
          "id" : 38,
          "ignore_host" : false,
          "interval" : 2,
          "label" : "NameNode Heap Usage (Daily)",
          "name" : "increase_nn_heap_usage_daily",
          "scope" : "ANY",
          "service_name" : "HDFS",
          "source" : {
            "parameters" : [
              {
                "display_name" : "Whether active and stanby NameNodes metrics should be merged",
                "description" : "Whether active and stanby NameNodes metrics should be merged.",
                "name" : "mergeHaMetrics",
                "value" : "false",
                "type" : "STRING"
              },
              {
                "display_name" : "Time interval in minutes",
                "description" : "Time interval in minutes.",
                "name" : "interval",
                "value" : "1441.0",
                "type" : "NUMERIC"
              },
              {
                "display_name" : "AMS application id",
                "description" : "The application id used to retrieve the metric.",
                "name" : "appId",
                "value" : "NAMENODE",
                "type" : "STRING"
              },
              {
                "display_name" : "Metric Name",
                "description" : "The metric to monitor.",
                "name" : "metricName",
                "value" : "jvm.JvmMetrics.MemHeapUsedM",
                "type" : "STRING"
              },
              {
                "display_name" : "The standard deviation threshold above which a warning is produced.",
                "name" : "metric.deviation.warning.threshold",
                "value" : "20.0",
                "type" : "PERCENT",
                "units" : "%",
                "threshold" : "WARNING"
              },
              {
                "display_name" : "The standard deviation threshold above which a critical alert is produced.",
                "name" : "metric.deviation.critical.threshold",
                "value" : "50.0",
                "type" : "PERCENT",
                "units" : "%",
                "threshold" : "CRITICAL"
              }
            ],
            "path" : "HDFS/2.1.0.2.0/package/alerts/alert_metrics_deviation.py",
            "type" : "SCRIPT"
          }
        }
      }
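
The GET response above returns the NUMERIC and PERCENT values as strings. A minimal sketch of normalizing them on the consuming side is shown here, assuming the response body has already been read as text. The function name `coerce_parameters` and the `NUMERIC_TYPES` tuple are hypothetical and are not part of Ambari.

```python
import json

# Parameter types whose values are expected to be numbers (assumption for this sketch).
NUMERIC_TYPES = ("NUMERIC", "PERCENT")


def coerce_parameters(definition):
    """Map parameter name -> value, converting numeric-typed values to float."""
    coerced = {}
    for param in definition["AlertDefinition"]["source"]["parameters"]:
        value = param["value"]
        if param.get("type") in NUMERIC_TYPES:
            value = float(value)  # works for 50.0 and for "50.0"
        coerced[param["name"]] = value
    return coerced


# Trimmed copy of the GET response above.
response_body = """
{
  "AlertDefinition": {
    "source": {
      "parameters": [
        {"name": "interval", "value": "1441.0", "type": "NUMERIC"},
        {"name": "appId", "value": "NAMENODE", "type": "STRING"},
        {"name": "metric.deviation.critical.threshold", "value": "50.0", "type": "PERCENT"}
      ]
    }
  }
}
"""

values = coerce_parameters(json.loads(response_body))
```

Keying the conversion on the declared parameter type leaves STRING parameters such as `appId` untouched.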
      

      Attachments

        1. AMBARI-15533.patch
          3 kB
          Jonathan Hurley

            People

              Assignee: Jonathan Hurley
              Reporter: Jonathan Hurley