Flink / FLINK-11742

Push metrics to Pushgateway without "instance"

Details

    Description

      According to the official documentation,

      https://prometheus.io/docs/concepts/jobs_instances/

      https://github.com/prometheus/pushgateway

      metrics pushed to the Prometheus Pushgateway need to carry an "instance" label.
      In practice, when the "instance" label is missing, Prometheus stores the metrics incorrectly: the series are not continuous and a lot of data is lost. After adding the "instance" label, the metrics return to normal.
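
To make the difference concrete, here is a minimal sketch of how the Pushgateway push path changes once an "instance" label is part of the grouping key. It relies on the documented path layout `/metrics/job/<JOB_NAME>{/<LABEL_NAME>/<LABEL_VALUE>}`. The helper name `pushgateway_path`, the job name `my-flink-job`, and the instance value `taskmanager-1` are hypothetical, and the sketch assumes label values contain no `/`.

```python
from urllib.parse import quote


def pushgateway_path(job, grouping_labels=None):
    """Build a push path of the form /metrics/job/<job>{/<label>/<value>}.

    Hypothetical helper for illustration; assumes values contain no '/'.
    """
    parts = ["metrics", "job", quote(job, safe="")]
    for name, value in (grouping_labels or {}).items():
        parts += [quote(name, safe=""), quote(value, safe="")]
    return "/" + "/".join(parts)


# Grouping key without "instance": every process of the job shares one group.
without_instance = pushgateway_path("my-flink-job")

# Grouping key with "instance": each process gets its own group.
with_instance = pushgateway_path("my-flink-job", {"instance": "taskmanager-1"})

print(without_instance)
print(with_instance)
```

The second path is the one that keeps metrics from different processes apart on the Pushgateway side.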

       

      Without "instance": (screenshot)

      With "instance": (screenshot)

       

       

      In Prometheus terms, an endpoint you can scrape is called an instance, usually corresponding to a single process. A collection of instances with the same purpose, a process replicated for scalability or reliability for example, is called a job.

      For example, an API server job with four replicated instances:
      job: api-server
      – instance 1: 1.2.3.4:5670
      – instance 2: 1.2.3.4:5671
      – instance 3: 5.6.7.8:5670
      – instance 4: 5.6.7.8:5671

      https://prometheus.io/docs/concepts/jobs_instances/#jobs-and-instances

      I think a Flink job corresponds to a Prometheus job, and each TaskManager and JobManager corresponds to a separate instance. If the jobName is used as the instance label, identical metrics from different TaskManagers will conflict, and aggregations such as sum will give wrong results.
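
The conflict can be illustrated with a toy model of the Pushgateway, assuming the documented PUT behaviour in which a push replaces all metrics stored under the same grouping key. This is a sketch, not Flink's reporter code: the functions `push` and `total`, the label values, and the metric values are hypothetical.

```python
def push(store, grouping_key, metrics):
    """Toy model of a Pushgateway PUT: replace all metrics in the group."""
    store[tuple(sorted(grouping_key.items()))] = dict(metrics)


def total(store, metric_name):
    """Sum one metric across all groups, like sum() over the scraped series."""
    return sum(group.get(metric_name, 0) for group in store.values())


# Two TaskManagers push under the same grouping key (jobName as instance):
# the second push overwrites the first.
shared = {}
key = {"job": "my-flink-job", "instance": "my-flink-job"}
push(shared, key, {"numRecordsIn": 100})
push(shared, key, {"numRecordsIn": 250})
shared_total = total(shared, "numRecordsIn")

# Each TaskManager pushes under its own instance label: both series survive.
separate = {}
push(separate, {"job": "my-flink-job", "instance": "taskmanager-1"}, {"numRecordsIn": 100})
push(separate, {"job": "my-flink-job", "instance": "taskmanager-2"}, {"numRecordsIn": 250})
separate_total = total(separate, "numRecordsIn")

print(shared_total, separate_total)
```

With a shared key only the last pushed value is left, so the sum under-reports. With one instance per process, both values contribute.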

      Attachments

        1. image-2019-02-25-17-16-28-618.png
          92 kB
          Tom Goong
        2. image-2019-02-25-17-16-59-034.png
          72 kB
          Tom Goong

            People

              Assignee: Unassigned
              Reporter: Tom Goong
              Votes: 0
              Watchers: 1

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10m