FLINK-11251

Incompatible metric name on prometheus reporter


Details

    Description

      # HELP flink_taskmanager_job_task_operator_KafkaConsumer_topic_partition_4_currentOffsets currentOffsets (scope: taskmanager_job_task_operator_KafkaConsumer_topic_partition_4)
      # TYPE flink_taskmanager_job_task_operator_KafkaConsumer_topic_partition_4_currentOffsets gauge
      flink_taskmanager_job_task_operator_KafkaConsumer_topic_partition_4_currentOffsets{task_attempt_id="5137e35cf7319787f6cd627621fd2ea7",host="localhost",task_attempt_num="0",tm_id="e72a527652f5af1358bdbc0f5bf6f49d",partition="4",topic="rt_lookback_state",job_id="546cf6f0d1f0b818afd9697c612f715c",task_id="d7b1ad914351f9ee5272ffff67f51160",operator_id="d7b1ad914351f9ee5272ffff67f51160",operator_name="Source:_kafka_lookback_state_source",task_name="Source:_kafka_lookback_state_source",job_name="FlinkRuleMatchPipeline",subtask_index="7",} 1.456090927E9
      # HELP flink_taskmanager_job_task_operator_KafkaConsumer_topic_partition_24_committedOffsets committedOffsets (scope: taskmanager_job_task_operator_KafkaConsumer_topic_partition_24)
      # TYPE flink_taskmanager_job_task_operator_KafkaConsumer_topic_partition_24_committedOffsets gauge
      flink_taskmanager_job_task_operator_KafkaConsumer_topic_partition_24_committedOffsets{task_attempt_id="9b666af68ec4734b25937b8b94cc5c84",host="localhost",task_attempt_num="0",tm_id="e72a527652f5af1358bdbc0f5bf6f49d",partition="24",topic="rt_event",job_id="546cf6f0d1f0b818afd9697c612f715c",task_id="61252f73469d3ffba207c548d29a0267",operator_id="61252f73469d3ffba207c548d29a0267",operator_name="Source:_kafka_source",task_name="Source:_kafka_source____sampling____parse_and_filter",job_name="FlinkRuleMatchPipeline",subtask_index="27",} 3.001186523E9
      

      This is a snippet from my Flink Prometheus reporter. It shows that the metric names for the Kafka current offsets and committed offsets changed after I migrated my Flink job from 1.6.0 to 1.6.3.

      The original metric names did not contain the partition index, i.e. the metric names should be flink_taskmanager_job_task_operator_KafkaConsumer_topic_partition_currentOffsets and flink_taskmanager_job_task_operator_KafkaConsumer_topic_partition_committedOffsets.

      After digging into the source code, I found that the incompatibility started with this PR, because it added a new overload getLogicalScope(CharacterFilter, char, int) but did not override it in the GenericValueMetricGroup class.
      When the tail metric group of a metric is a GenericValueMetricGroup and the new getLogicalScope is called, e.g. via FrontMetricGroup#getLogicalScope, the value group name is not ignored, although it was ignored in previously released versions.
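
      The pattern can be illustrated with a minimal, self-contained sketch. This is not Flink's actual code: the classes Group, ValueGroup and FixedValueGroup are hypothetical stand-ins, the CharacterFilter argument is omitted, and the way the new overload delegates to its parent is an assumption chosen to reproduce the names seen above.

```java
class LogicalScopeSketch {

    /** Hypothetical stand-in for a metric group; not Flink's real class. */
    static class Group {
        final String name;
        final Group parent;

        Group(String name, Group parent) {
            this.name = name;
            this.parent = parent;
        }

        /** Original method: joins group names from the root down to this group. */
        String getLogicalScope(char delimiter) {
            return parent == null ? name : parent.getLogicalScope(delimiter) + delimiter + name;
        }

        /** Overload added later (assumed shape): parent scope plus this group's own name. */
        String getLogicalScope(char delimiter, int reporterIndex) {
            String prefix = parent == null ? "" : parent.getLogicalScope(delimiter) + delimiter;
            return prefix + name;
        }
    }

    /** Value group: its name is a value such as "4" and must stay out of the scope. */
    static class ValueGroup extends Group {
        ValueGroup(String value, Group keyGroup) {
            super(value, keyGroup);
        }

        @Override
        String getLogicalScope(char delimiter) {
            return parent.getLogicalScope(delimiter); // skip the value
        }
        // The (char, int) overload is NOT overridden, so the inherited
        // version appends the value when this group is the tail.
    }

    /** Same as ValueGroup, but also overrides the new overload. */
    static class FixedValueGroup extends ValueGroup {
        FixedValueGroup(String value, Group keyGroup) {
            super(value, keyGroup);
        }

        @Override
        String getLogicalScope(char delimiter, int reporterIndex) {
            return parent.getLogicalScope(delimiter, reporterIndex); // skip the value
        }
    }

    /** Builds KafkaConsumer -> topic -> rt_event -> partition and returns the key group. */
    static Group partitionKeyGroup() {
        Group consumer = new Group("KafkaConsumer", null);
        Group topicKey = new Group("topic", consumer);
        Group topicValue = new ValueGroup("rt_event", topicKey);
        return new Group("partition", topicValue);
    }

    static String buggyScope() {
        return new ValueGroup("4", partitionKeyGroup()).getLogicalScope('_', 0);
    }

    static String fixedScope() {
        return new FixedValueGroup("4", partitionKeyGroup()).getLogicalScope('_', 0);
    }

    public static void main(String[] args) {
        System.out.println(buggyScope()); // KafkaConsumer_topic_partition_4
        System.out.println(fixedScope()); // KafkaConsumer_topic_partition
    }
}
```

      In this sketch, only the tail value ("4") leaks into the scope, which matches the metric names in the snippet: the topic value is still skipped because the parent chain goes through the original, overridden method.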

            People

              tonywei Wei-Che Wei
              tonywei Wei-Che Wei

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m