  1. Kafka
  2. KAFKA-15214

Add metrics for OffsetOutOfRangeException when tiered storage is enabled


Details

    • Type: Task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • 3.6.0
    • 4.0.0
    • metrics

    Description

      The current RemoteReadErrorsPerSec metric does not count the exception type OffsetOutOfRangeException.

      In our testing of the tiered storage feature (at Apple), we noticed several cases where remote download was affected and got stuck because of repeated OffsetOutOfRangeException errors on particular brokers or topic partitions. The root causes can vary, but without a metric it is currently very hard to catch this issue and debug it in a timely fashion. The exception itself may not be the root cause, but a metric for it would be a good signal for us to alert on and investigate.
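
      To make the request concrete, here is a minimal sketch of counting OffsetOutOfRangeException separately from other remote read failures. It is not Kafka's implementation: the class RemoteReadErrorCounters, its methods, and the nested stand-in exception are hypothetical names used only for illustration, and plain counters stand in for the broker's actual metrics registry.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper, not part of Kafka: keeps two independent error counters.
final class RemoteReadErrorCounters {

    // Stand-in for Kafka's OffsetOutOfRangeException so the sketch compiles on its own.
    static class OffsetOutOfRangeException extends RuntimeException {
        OffsetOutOfRangeException(String message) {
            super(message);
        }
    }

    private final LongAdder otherRemoteReadErrors = new LongAdder();
    private final LongAdder offsetOutOfRangeErrors = new LongAdder();

    // Records one failed remote read, routing out-of-range offsets to their own counter.
    void recordFailure(Throwable error) {
        if (error instanceof OffsetOutOfRangeException) {
            offsetOutOfRangeErrors.increment();
        } else {
            otherRemoteReadErrors.increment();
        }
    }

    long otherRemoteReadErrors() {
        return otherRemoteReadErrors.sum();
    }

    long offsetOutOfRangeErrors() {
        return offsetOutOfRangeErrors.sum();
    }
}
```

      With a dedicated counter like this, an alert could fire when the out-of-range count keeps growing for one broker or partition, even if the general remote read error rate stays flat.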

      Related discussion
      https://github.com/apache/kafka/pull/13944#discussion_r1266243006

      I am happy to contribute to this if the request is accepted.

            People

              Assignee: Unassigned
              Reporter: Lixin Yao (yaolixin)
              Votes: 0
              Watchers: 9
