Flink / FLINK-13787

PrometheusPushGatewayReporter does not cleanup TM metrics when run on kubernetes

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.7.2, 1.8.1, 1.9.0
    • Fix Version/s: None
    • Component/s: Runtime / Metrics
    • Labels: None

    Description

      I ran a Flink job on Kubernetes using PrometheusPushGatewayReporter, and I can see the metrics from the Flink JobManager and TaskManager in the Pushgateway's UI.

      When I cancel the job, the JobManager's metrics disappear, but the TaskManager's metrics still exist, even though I have set deleteOnShutdown to true.

      The configuration is:

      metrics.reporters: "prom"
      metrics.reporter.prom.class: "org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter"
      metrics.reporter.prom.jobName: "WordCount"
      metrics.reporter.prom.host: "localhost"
      metrics.reporter.prom.port: "9091"
      metrics.reporter.prom.randomJobNameSuffix: "true"
      metrics.reporter.prom.filterLabelValueCharacters: "true"
      metrics.reporter.prom.deleteOnShutdown: "true"
      

       

      Other people have also encountered this problem: https://stackoverflow.com/questions/54420498/flink-prometheus-push-gateway-reporter-delete-metrics-on-job-shutdown. A similar issue is tracked in FLINK-11457.

      As Prometheus is a very important metrics system on Kubernetes, solving this problem would help users monitor their Flink jobs.
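      Until the reporter cleans up after itself, one possible workaround is to delete the leftover groups from the Pushgateway by hand once the job has been cancelled. The following is only a minimal sketch, not part of Flink: it assumes a Pushgateway version that exposes the /api/v1/metrics query API, that the reporter pushes with the job name as its only grouping label, and that job names contain no slashes. The helper names (stale_jobs, delete_url, list_groups, cleanup) are hypothetical; the host, port and job name prefix are taken from the configuration above.

```python
import json
import urllib.parse
import urllib.request

PUSHGATEWAY = "http://localhost:9091"  # metrics.reporter.prom.host / .port
JOB_PREFIX = "WordCount"               # metrics.reporter.prom.jobName


def stale_jobs(groups, prefix):
    """Return the sorted job names of all push groups whose job label starts with prefix."""
    return sorted({
        g["labels"]["job"]
        for g in groups
        if g.get("labels", {}).get("job", "").startswith(prefix)
    })


def delete_url(base_url, job):
    """Build the Pushgateway URL addressing the group that has only this job label."""
    return f"{base_url.rstrip('/')}/metrics/job/{urllib.parse.quote(job, safe='')}"


def list_groups(base_url):
    """Fetch all push groups (assumes the /api/v1/metrics endpoint is available)."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/api/v1/metrics") as resp:
        return json.load(resp)["data"]


def cleanup(base_url=PUSHGATEWAY, prefix=JOB_PREFIX):
    """Send an HTTP DELETE for every group whose job name starts with prefix."""
    for job in stale_jobs(list_groups(base_url), prefix):
        req = urllib.request.Request(delete_url(base_url, job), method="DELETE")
        urllib.request.urlopen(req).close()
```

      Because randomJobNameSuffix is enabled, every JobManager and TaskManager pushes under its own "WordCount" plus suffix job name, which is why the sketch matches on the prefix. Calling cleanup() removes every matching group, including those of any still-running job that shares the prefix, so it should only be run after the job has been cancelled.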

          People

            Assignee: Unassigned
            Reporter: Kaibo Zhou (kaibo.zhou)
            Votes: 0
            Watchers: 4
