AMBARI-25400

Issue while determining live collector in case of HA

Details

    Description

      If a collector answers a request to http://collectorhost:port//ws/v1/timeline/metrics/livenodes with an HTTP 500 error, the sink is unable to determine a live/healthy collector.

       

      The sink tries to connect to another collector only when the first collector is unreachable, that is, when the request fails with an IOException. Any response code other than 200 should also be treated as the first collector being unreachable.
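
The requested behavior can be illustrated with a minimal sketch. It is not the Ambari implementation: `CollectorFailoverSketch`, `LiveNodesProbe`, and `findLiveCollector` are hypothetical names, and the probe stands in for the HTTP call to the livenodes endpoint so the example runs without a network. The point is only that a non-200 status and an IOException both lead to trying the next collector.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class CollectorFailoverSketch {

  /** Hypothetical probe: returns the HTTP status of the livenodes call, or throws if the host is unreachable. */
  interface LiveNodesProbe {
    int statusOf(String collectorHost) throws IOException;
  }

  /** Returns the first collector whose probe answers 200, or null if none does. */
  static String findLiveCollector(List<String> collectors, LiveNodesProbe probe) {
    for (String host : collectors) {
      try {
        if (probe.statusOf(host) == 200) {
          return host;
        }
        // Any non-200 status (for example 500) is handled like an unreachable
        // collector: fall through and try the next host.
      } catch (IOException e) {
        // Unreachable collector: try the next host.
      }
    }
    return null;
  }

  public static void main(String[] args) {
    List<String> collectors = Arrays.asList("c1", "c2", "c3");
    LiveNodesProbe probe = host -> {
      if (host.equals("c1")) {
        throw new IOException("connection refused");
      }
      return host.equals("c2") ? 500 : 200;
    };
    System.out.println(findLiveCollector(collectors, probe)); // prints "c3"
  }
}
```

Here `c1` is unreachable and `c2` returns 500, so the sketch settles on `c3`. If only the IOException were handled, the search would stop at `c2`.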

       

      https://github.com/apache/ambari/blob/release-2.7.4/ambari-metrics/ambari-metrics-common/src/main/java/org/apache/hadoop/metrics2/sink/timeline/AbstractTimelineMetricsSink.java#L629

       

       Possible fix

       

            if (responseCode == 200) {
              try (InputStream in = connection.getInputStream()) {
                StringWriter writer = new StringWriter();
                IOUtils.copy(in, writer);
                try {
                  collectors = gson.fromJson(writer.toString(), new TypeToken<List<String>>(){}.getType());
                } catch (JsonSyntaxException jse) {
                  // Swallow this at the behest of still trying to POST
                  LOG.debug("Exception deserializing the json data on live " +
                    "collector nodes.", jse);
                }
              }
            } else if (responseCode == 500){
              String warnMsg = "Unable to connect to collector to find live nodes, Internal server error";
              throw new MetricCollectorUnavailableException(warnMsg);
            }
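
The snippet above rejects only a 500 response, while the description asks for every non-200 response to count as an unavailable collector. A possible variation, offered as a sketch under assumptions rather than as the committed fix, checks against 200 directly. `LiveNodesStatusCheck`, `requireOk`, and the local `CollectorUnavailableException` are hypothetical stand-ins so the example compiles without Ambari on the classpath.

```java
import java.net.HttpURLConnection;

class LiveNodesStatusCheck {

  /** Local stand-in for the exception thrown in the snippet above; not Ambari's class. */
  static class CollectorUnavailableException extends Exception {
    CollectorUnavailableException(String message) {
      super(message);
    }
  }

  /** Throws for every non-200 status so the caller can fail over to another collector. */
  static void requireOk(int responseCode) throws CollectorUnavailableException {
    if (responseCode != HttpURLConnection.HTTP_OK) {
      throw new CollectorUnavailableException(
          "Unable to find live nodes, collector returned HTTP " + responseCode);
    }
  }

  public static void main(String[] args) {
    for (int code : new int[] {200, 404, 500, 503}) {
      try {
        requireOk(code);
        System.out.println(code + " -> parse live nodes");
      } catch (CollectorUnavailableException e) {
        System.out.println(code + " -> " + e.getMessage());
      }
    }
  }
}
```

Including the status code in the message keeps the log useful when the failure is something other than an internal server error.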
      

       

          People

            eberhardtp Éberhardt Péter
            apappu@hortonworks.com amarnath reddy pappu

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 1h 40m