Details
- Type: Bug
- Status: In Progress
- Priority: Not a Priority
- Resolution: Unresolved
Description
According to the official documentation,
https://prometheus.io/docs/concepts/jobs_instances/
https://github.com/prometheus/pushgateway
when sending a metric to the Prometheus Pushgateway, you need to provide an "instance" label.
In practice, when the "instance" label is missing, Prometheus stores the metrics incorrectly: the series are not continuous and a lot of data is lost. After the "instance" label is added, the metrics return to normal.
[Screenshot: metrics with no "instance" label]
[Screenshot: metrics with the "instance" label]
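To make the difference concrete, here is a minimal sketch of how the push URL changes once an "instance" label is part of the grouping key. It follows the `/metrics/job/<job>/<label>/<value>` path layout described in the Pushgateway README. The host, job name, instance name, and the helper `pushgateway_url` are hypothetical, and the sketch assumes label values contain no "/" characters.

```python
from typing import Mapping
from urllib.parse import quote


def pushgateway_url(base_url: str, job: str, grouping: Mapping[str, str]) -> str:
    """Build a push URL of the form <base>/metrics/job/<job>/<label>/<value>/..."""
    parts = [base_url.rstrip("/"), "metrics", "job", quote(job, safe="")]
    for label, value in grouping.items():
        # Each extra grouping label adds a <label>/<value> pair to the path.
        parts.append(quote(label, safe=""))
        parts.append(quote(value, safe=""))
    return "/".join(parts)


# Hypothetical host, job, and instance names, for illustration only.
without_instance = pushgateway_url("http://pushgateway:9091", "my-flink-job", {})
with_instance = pushgateway_url(
    "http://pushgateway:9091", "my-flink-job", {"instance": "taskmanager-1"}
)
print(without_instance)
print(with_instance)
```

With the second form, every process that pushes under its own instance value gets its own group on the Pushgateway instead of sharing one group per job.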
In Prometheus terms, an endpoint you can scrape is called an instance, usually corresponding to a single process. A collection of instances with the same purpose, a process replicated for scalability or reliability for example, is called a job.
For example, an API server job with four replicated instances:
job: api-server
– instance 1: 1.2.3.4:5670
– instance 2: 1.2.3.4:5671
– instance 3: 5.6.7.8:5670
– instance 4: 5.6.7.8:5671
https://prometheus.io/docs/concepts/jobs_instances/#jobs-and-instances
I think a Flink job corresponds to a Prometheus job, and each TaskManager and JobManager corresponds to a different instance. If the jobName is used as the instance label, identical metrics from different TaskManagers will conflict, and aggregations such as sum will fail.
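The conflict can be illustrated with a toy model. This is not the Pushgateway or the Flink reporter itself, only a sketch that assumes PUT-style semantics, where a push replaces whatever is stored under the same grouping key. The metric name, values, and the helpers `push` and `total` are hypothetical.

```python
def push(store: dict, job: str, grouping: dict, metrics: dict) -> None:
    """Toy model: a push replaces whatever is stored under the same grouping key."""
    key = (("job", job),) + tuple(sorted(grouping.items()))
    store[key] = dict(metrics)


def total(store: dict, name: str) -> int:
    """Sum one metric across all stored groups, similar to sum() over series."""
    return sum(group.get(name, 0) for group in store.values())


# Hypothetical metric name and values, for illustration only.
job_only = {}
push(job_only, "my-flink-job", {}, {"records_in": 100})  # from TaskManager 1
push(job_only, "my-flink-job", {}, {"records_in": 250})  # from TaskManager 2, overwrites

per_instance = {}
push(per_instance, "my-flink-job", {"instance": "taskmanager-1"}, {"records_in": 100})
push(per_instance, "my-flink-job", {"instance": "taskmanager-2"}, {"records_in": 250})

print(total(job_only, "records_in"))      # 250: one TaskManager's value was lost
print(total(per_instance, "records_in"))  # 350: both TaskManagers are counted
```

In the job-only case the two TaskManagers share one grouping key, so only the most recent push survives and the sum is wrong. With a distinct instance per process, both values are kept.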