Flink / FLINK-17368

exception message in PrometheusPushGatewayReporter


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 1.10.0
    • Fix Version/s: None
    • Component/s: Runtime / Metrics
    • Labels:
      None

      Description

      When sending Flink metrics to a Prometheus Pushgateway using org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter, there are a lot of exception messages in the TaskManager log. Here is the exception stack:

      2020-04-23 18:16:44,927 WARN  org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter  - Failed to push metrics to PushGateway with jobName a517f2f8bb79b59abb5e596f34adca27, groupingKey {}.
      java.io.IOException: Response code from http://10.3.71.136:9091/metrics/job/a517f2f8bb79b59abb5e596f34adca27 was 200
              at org.apache.flink.shaded.io.prometheus.client.exporter.PushGateway.doRequest(PushGateway.java:297)
              at org.apache.flink.shaded.io.prometheus.client.exporter.PushGateway.push(PushGateway.java:127)
              at org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter.report(PrometheusPushGatewayReporter.java:109)
              at org.apache.flink.runtime.metrics.MetricRegistryImpl$ReporterTask.run(MetricRegistryImpl.java:441)
              at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
              at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
              at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
              at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
              at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
              at java.base/java.lang.Thread.run(Thread.java:834)
      

       

      After investigation, I found that io.prometheus:simpleclient_pushgateway:0.3.0 causes the exception. Before io.prometheus:simpleclient_pushgateway:0.8.0, io.prometheus.client.exporter.PushGateway#doRequest treats only response code 202 as a successful push, so response code 200 is reported as a failed transmission even though the Pushgateway accepted the metrics. In io.prometheus:simpleclient_pushgateway:0.8.0, any 2xx response code indicates a successful transmission.
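
The difference between the two checks can be sketched as follows. This is a minimal illustration based only on the behavior described above, not the actual source of PushGateway#doRequest; the class name PushResponseCheckSketch and both helper methods are hypothetical.

```java
import java.net.HttpURLConnection;

public class PushResponseCheckSketch {

    // Hypothetical helper: the strict check described for versions before 0.8.0,
    // where only 202 (Accepted) counts as a successful push.
    public static boolean isSuccessBefore080(int responseCode) {
        return responseCode == HttpURLConnection.HTTP_ACCEPTED;
    }

    // Hypothetical helper: the relaxed check described for 0.8.0,
    // where any 2xx response code counts as a successful push.
    public static boolean isSuccessIn080(int responseCode) {
        return responseCode / 100 == 2;
    }

    public static void main(String[] args) {
        int code = 200; // the response code seen in the log above
        System.out.println("strict 202 check accepts " + code + ": " + isSuccessBefore080(code));
        System.out.println("2xx check accepts " + code + ": " + isSuccessIn080(code));
    }
}
```

With a response code of 200, the strict check rejects the push, which matches the IOException in the log, while the 2xx check accepts it.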

       

      After we changed the version of io.prometheus:simpleclient_pushgateway to 0.8.0 in the flink-metrics-prometheus module, the exception no longer appeared.

       

              People

              • Assignee:
                Unassigned
                Reporter:
                Leo Zhou zl