  1. Flink
  2. FLINK-13787

PrometheusPushGatewayReporter does not clean up TM metrics when run on Kubernetes

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.7.2, 1.8.1, 1.9.0
    • Fix Version/s: None
    • Component/s: Runtime / Metrics
    • Labels:
      None

      Description

      I ran a Flink job on Kubernetes with PrometheusPushGatewayReporter. I can see the metrics from the Flink JobManager and TaskManager in the Pushgateway UI.

      When I cancel the job, the JobManager's metrics disappear, but the TaskManager's metrics still exist, even though I have set deleteOnShutdown to true.

      The configuration is:

      metrics.reporters: "prom"
      metrics.reporter.prom.class: "org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter"
      metrics.reporter.prom.jobName: "WordCount"
      metrics.reporter.prom.host: "localhost"
      metrics.reporter.prom.port: "9091"
      metrics.reporter.prom.randomJobNameSuffix: "true"
      metrics.reporter.prom.filterLabelValueCharacters: "true"
      metrics.reporter.prom.deleteOnShutdown: "true"
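
      As a stopgap while the TaskManager groups are left behind, a stale group can be removed by hand through the Pushgateway HTTP API, which accepts a DELETE on the group's URL. The following is only a minimal sketch, not part of Flink: the function names are hypothetical, the host and port are taken from the configuration above, and the group name (jobName plus the random suffix) is assumed to be copied from the Pushgateway UI and to contain no '/'.

```python
import urllib.request
from urllib.parse import quote


def pushgateway_group_url(host: str, port: int, job: str) -> str:
    """Build the URL of the Pushgateway metric group for one job name."""
    # Assumption: the job name contains no '/'; such names would need
    # the Pushgateway's base64 path form instead.
    return f"http://{host}:{port}/metrics/job/{quote(job, safe='')}"


def delete_group(host: str, port: int, job: str, timeout: float = 5.0) -> int:
    """Send an HTTP DELETE for the group and return the response status code."""
    request = urllib.request.Request(
        pushgateway_group_url(host, port, job), method="DELETE"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

      For example, `delete_group("localhost", 9091, "WordCount0123abcd")` would target a group whose name is the hypothetical suffixed job name `WordCount0123abcd`. Because randomJobNameSuffix gives each JobManager and TaskManager its own group, every leftover TaskManager group would need its own call.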
      

       

      Other people have also encountered this problem: https://stackoverflow.com/questions/54420498/flink-prometheus-push-gateway-reporter-delete-metrics-on-job-shutdown. A similar issue is FLINK-11457.

       

      As Prometheus is a very important metrics system on Kubernetes, solving this problem would help users monitor their Flink jobs.

            People

            • Assignee: Unassigned
            • Reporter: Kaibo Zhou (kaibo.zhou)
            • Votes: 0
            • Watchers: 4