  1. Ambari
  2. AMBARI-25326

AMS - no HBase and Hive metrics post-upgrade when using 2 collectors

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.3
    • Fix Version/s: None
    • Component/s: ambari-metrics
    • Labels:
      None

      Description

      This appears to be a bug that occurs when two Metrics Collectors are deployed: the Hive and HBase services are unable to send metrics.

      Error: 2019-06-10 02:42:59,215 INFO timeline timeline.HadoopTimelineMetricsSink: No live collector to send metrics to. Metrics to be sent will be discarded. This message will be skipped for the next 20
      Debug-level logging shows the following:
      2019-06-14 20:35:29,538 DEBUG main timeline.HadoopTimelineMetricsSink: Trying to find live collector host from : bolhdppname5.micron.com,bolhdppname4.micron.com
      2019-06-14 20:35:29,538 DEBUG main timeline.HadoopTimelineMetricsSink: Requesting live collector nodes : http://bolhdppname5.micron.com,bolhdppname4.micron.com:6188/ws/v1/timeline/metrics/livenodes
      2019-06-14 20:35:29,557 DEBUG main timeline.HadoopTimelineMetricsSink: Unable to connect to collector, http://bolhdppname5.micron.com,bolhdppname4.micron.com:6188/ws/v1/timeline/metrics/livenodes
      2019-06-14 20:35:29,557 DEBUG main timeline.HadoopTimelineMetricsSink: java.net.UnknownHostException: bolhdppname5.micron.com,bolhdppname4.micron.com
      2019-06-14 20:35:29,558 DEBUG main timeline.HadoopTimelineMetricsSink: Collector bolhdppname5.micron.com,bolhdppname4.micron.com is not longer live. Removing it from list of know live collector hosts : []
      2019-06-14 20:35:29,558 DEBUG main timeline.HadoopTimelineMetricsSink: No live collectors from configuration.
      

      The sink parses the hostnames incorrectly when there are two collectors: the comma-separated list is treated as a single hostname, which leads to the UnknownHostException shown above.
      The Hive and HBase services can determine the live collectors either through curl or through ZooKeeper, but the configs do not support fetching the live collector nodes from ZooKeeper.
      To work around this, we added the following line for HBase:
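
      The parsing problem can be illustrated with a minimal Python sketch. It is not the actual HadoopTimelineMetricsSink code (which is Java); the function names `split_collector_hosts` and `livenodes_urls` are hypothetical, and the port and URL path are taken from the debug log above. The sketch only shows the behaviour one would expect: split the configured value on commas first, then build one livenodes URL per host.

```python
def split_collector_hosts(raw):
    """Split a comma-separated collector host string into individual hostnames."""
    # Drop surrounding whitespace and skip empty entries such as "a,,b".
    return [host.strip() for host in raw.split(",") if host.strip()]


def livenodes_urls(raw_hosts, port=6188, scheme="http"):
    """Build one livenodes URL per collector host (hypothetical helper)."""
    return [
        f"{scheme}://{host}:{port}/ws/v1/timeline/metrics/livenodes"
        for host in split_collector_hosts(raw_hosts)
    ]


if __name__ == "__main__":
    configured = "bolhdppname5.micron.com,bolhdppname4.micron.com"
    for url in livenodes_urls(configured):
        print(url)
```

      In the log above, by contrast, the whole comma-separated value ends up in the host position of a single URL, so name resolution fails for both collectors at once.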

      *.sink.timeline.zookeeper.quorum=bolhdppname5.micron.com:2181,bolhdppname1.micron.com:2181,bolhdppname4.micron.com:2181,bolhdppname2.micron.com:2181,bolhdppname3.micron.com:2181
      

      to
      /var/lib/ambari-server/resources/stacks/HDP/3.0/services/HBASE/package/templates/hadoop-metrics2-hbase.properties-GANGLIA-MASTER.j2
      For Hive, add

      *.sink.timeline.zookeeper.quorum=bolhdppname5.micron.com:2181,bolhdppname1.micron.com:2181,bolhdppname4.micron.com:2181,bolhdppname2.micron.com:2181,bolhdppname3.micron.com:2181
      

      to all four files under /var/lib/ambari-server/resources/stacks/HDP/3.0/services/HIVE/package/templates/ (on the Ambari server):

      root@c1207-node1 templates# ll | grep metr
      -rwxr-xr-x 1 root root 3032 Sep 18 2018 hadoop-metrics2-hivemetastore.properties.j2
      -rwxr-xr-x 1 root root 3016 Sep 18 2018 hadoop-metrics2-hiveserver2.properties.j2
      -rwxr-xr-x 1 root root 2959 Sep 18 2018 hadoop-metrics2-llapdaemon.j2
      -rwxr-xr-x 1 root root 3015 Sep 18 2018 hadoop-metrics2-llaptaskscheduler.j2
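
      The manual edit described above could be scripted roughly as follows. This is a sketch under assumptions, not an Ambari tool: the helper name `ensure_quorum_line` is hypothetical, and it simply appends the `*.sink.timeline.zookeeper.quorum` property to a template file unless a line for that key is already present. The quorum value must be supplied for your own cluster.

```python
from pathlib import Path

# Property key used in the workaround described in this issue.
QUORUM_KEY = "*.sink.timeline.zookeeper.quorum"


def ensure_quorum_line(template_path, quorum):
    """Append the quorum property to a template unless the key is already set.

    Returns True if the file was modified, False if it was left untouched.
    """
    path = Path(template_path)
    text = path.read_text()
    for line in text.splitlines():
        if line.strip().startswith(QUORUM_KEY + "="):
            return False  # already configured; do not add a duplicate
    # Make sure the new property starts on its own line.
    separator = "" if text.endswith("\n") or not text else "\n"
    path.write_text(f"{text}{separator}{QUORUM_KEY}={quorum}\n")
    return True
```

      Calling it once per template file (the HBase template and the four Hive templates listed above) keeps the change idempotent, so running the script a second time does not add a duplicate line.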
      

            People

            • Assignee: Gabor Boros (gboros)
            • Reporter: Gabor Boros (gboros)