Flink / FLINK-13787

PrometheusPushGatewayReporter does not cleanup TM metrics when run on kubernetes

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.7.2, 1.8.1, 1.9.0
    • Fix Version/s: None
    • Component/s: Runtime / Metrics
    • Labels: None

    Description

      I ran a Flink job on Kubernetes with the PrometheusPushGatewayReporter. Metrics from both the Flink JobManager and the TaskManager show up in the Pushgateway UI.

      When I cancel the job, the JobManager's metrics disappear, but the TaskManager's metrics remain, even though deleteOnShutdown is set to true.

      The configuration is:

      metrics.reporters: "prom"
      metrics.reporter.prom.class: "org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter"
      metrics.reporter.prom.jobName: "WordCount"
      metrics.reporter.prom.host: "localhost"
      metrics.reporter.prom.port: "9091"
      metrics.reporter.prom.randomJobNameSuffix: "true"
      metrics.reporter.prom.filterLabelValueCharacters: "true"
      metrics.reporter.prom.deleteOnShutdown: "true"
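
To confirm which groups are left behind after the job is cancelled, one option is to query the Pushgateway directly instead of relying on the UI. The following is a minimal sketch, not part of the original report. It assumes a Pushgateway version that exposes the `GET /api/v1/metrics` query API, and it assumes the random suffix is appended to the configured jobName, so that every group created by this job carries a `job` label starting with `WordCount`. The function names `leftover_jobs` and `fetch_groups` are hypothetical.

```python
import json
import urllib.request


def leftover_jobs(payload, prefix):
    """Return the sorted `job` labels in a query-API payload that start with `prefix`."""
    groups = payload.get("data", [])
    jobs = {group.get("labels", {}).get("job", "") for group in groups}
    return sorted(job for job in jobs if job.startswith(prefix))


def fetch_groups(base_url):
    """Fetch all metric groups from the Pushgateway query API.

    Assumption: the gateway is reachable at `base_url` and supports
    GET /api/v1/metrics.
    """
    with urllib.request.urlopen(f"{base_url}/api/v1/metrics") as resp:
        return json.load(resp)
```

Calling `leftover_jobs(fetch_groups("http://localhost:9091"), "WordCount")` after cancelling the job would list the groups that were not deleted. If the behaviour described here occurs, the TaskManager groups would be expected to appear in that list.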
      

       

      Other people have run into the same problem: https://stackoverflow.com/questions/54420498/flink-prometheus-push-gateway-reporter-delete-metrics-on-job-shutdown. A similar issue is FLINK-11457.

       

      Since Prometheus is a very important metrics system on Kubernetes, solving this problem would help users monitor their Flink jobs.
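
Until the reporter cleans up after itself, stale groups can be removed by hand through the Pushgateway's HTTP API, which accepts `DELETE` on `/metrics/job/<job>` (optionally followed by `/<label>/<value>` pairs). The sketch below is one possible workaround under stated assumptions, not a fix endorsed in this issue. The helper names `group_url` and `delete_group` are hypothetical, and the job name must be the full name including the random suffix. Label values that contain a slash need the Pushgateway's base64 form, which this sketch does not handle.

```python
import urllib.parse
import urllib.request


def group_url(base_url, job, labels=None):
    """Build the Pushgateway URL that identifies one metric group."""
    path = "/metrics/job/" + urllib.parse.quote(job, safe="")
    for name, value in (labels or {}).items():
        # Each grouping label is appended as /<name>/<value>.
        path += "/" + urllib.parse.quote(name, safe="")
        path += "/" + urllib.parse.quote(value, safe="")
    return base_url.rstrip("/") + path


def delete_group(base_url, job, labels=None):
    """Send DELETE for one group and return the HTTP status code.

    Assumption: the gateway is reachable at `base_url`.
    """
    req = urllib.request.Request(group_url(base_url, job, labels), method="DELETE")
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

For example, `delete_group("http://localhost:9091", job_name)` would remove the group for one leftover TaskManager, where `job_name` is the suffixed name shown in the Pushgateway UI.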

            People

              Assignee: Unassigned
              Reporter: Kaibo Zhou (kaibo.zhou)
              Votes: 0
              Watchers: 6
