KAFKA-6376: Improve Streams metrics for skipped records

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 2.0.0
    • Component/s: metrics, streams

    Description

      Copied from the KIP-210 discussion thread:

      Note that we currently have two metrics for `skipped-records` at different
      levels:

      1) at the highest level, the thread level, we have `skipped-records`,
      which counts all records skipped due to deserialization errors.
      2) at the lower processor-node level, we have
      `skippedDueToDeserializationError`, which counts the records skipped at
      that specific source node due to deserialization errors.

      So you can see that 1) does not cover any other scenarios and can just be
      thought of as an aggregate of 2) across all the tasks' source nodes.
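
      The relationship between the two levels can be sketched as follows. This is a minimal illustration under stated assumptions, not Kafka Streams code: the class `SkippedRecordsSketch`, its methods, and the "taskId/sourceNode" key format are all hypothetical. It only shows that a thread-level value built this way is the sum of the per-source-node counts and therefore cannot reflect any other drop cause.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model, not the Kafka Streams implementation: every source node
// keeps its own count of records skipped because deserialization failed, and
// the thread-level value is simply the sum over all of those nodes.
final class SkippedRecordsSketch {
    // Key is "<taskId>/<sourceNodeName>"; this naming is an assumption made
    // for the example only.
    private final Map<String, Long> skippedPerSourceNode = new LinkedHashMap<>();

    // Called when a source node fails to deserialize a record and skips it.
    void recordDeserializationSkip(String taskId, String sourceNode) {
        skippedPerSourceNode.merge(taskId + "/" + sourceNode, 1L, Long::sum);
    }

    // Node-level view: skips seen by one source node of one task.
    long nodeLevel(String taskId, String sourceNode) {
        return skippedPerSourceNode.getOrDefault(taskId + "/" + sourceNode, 0L);
    }

    // Thread-level view: aggregate of the node-level counts across all tasks.
    long threadLevel() {
        long total = 0L;
        for (long count : skippedPerSourceNode.values()) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        SkippedRecordsSketch metrics = new SkippedRecordsSketch();
        metrics.recordDeserializationSkip("0_0", "source-a");
        metrics.recordDeserializationSkip("0_0", "source-a");
        metrics.recordDeserializationSkip("0_1", "source-b");
        System.out.println("node 0_0/source-a: " + metrics.nodeLevel("0_0", "source-a"));
        System.out.println("thread total:      " + metrics.threadLevel());
    }
}
```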
      However, there are other places that can cause a record to be dropped, for
      example:

      1) https://issues.apache.org/jira/browse/KAFKA-5784: records could be
      dropped because their window has elapsed.
      2) KIP-210: records could be dropped on the producer side.
      3) records could be dropped during user-customized processing on errors.

      guozhang Not sure what you mean by "3) records could be dropped during user-customized processing on errors."

      By the way, we also drop records with a null key and/or value for certain DSL operations. These should be included as well.
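
      To make the gap concrete, the drop causes mentioned in this thread can be modelled as separate counters. The sketch below is illustrative only: `SkipReasonSketch` and the `Reason` names are hypothetical and are not Kafka metric names, and the "user-customized processing" case is left out because its meaning is questioned above. It contrasts a total that only sees deserialization errors with one that sums every reason.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch: counts skipped records per reason so that a total can
// cover every drop cause rather than deserialization errors alone.
final class SkipReasonSketch {
    // Illustrative reason names; assumptions made for this example only.
    enum Reason {
        DESERIALIZATION_ERROR,
        WINDOW_ELAPSED,
        PRODUCER_ERROR,
        NULL_KEY_OR_VALUE
    }

    private final Map<Reason, Long> skipped = new EnumMap<>(Reason.class);

    // Record one skipped record for the given reason.
    void skip(Reason reason) {
        skipped.merge(reason, 1L, Long::sum);
    }

    long count(Reason reason) {
        return skipped.getOrDefault(reason, 0L);
    }

    // Mirrors the behaviour described in the issue: only deserialization
    // errors are visible in the total.
    long deserializationOnlyTotal() {
        return count(Reason.DESERIALIZATION_ERROR);
    }

    // A total that includes every reason a record was dropped.
    long totalAcrossAllReasons() {
        long total = 0L;
        for (long c : skipped.values()) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        SkipReasonSketch metrics = new SkipReasonSketch();
        metrics.skip(Reason.DESERIALIZATION_ERROR);
        metrics.skip(Reason.WINDOW_ELAPSED);
        metrics.skip(Reason.NULL_KEY_OR_VALUE);
        metrics.skip(Reason.NULL_KEY_OR_VALUE);
        System.out.println("deserialization only: " + metrics.deserializationOnlyTotal());
        System.out.println("all reasons:          " + metrics.totalAcrossAllReasons());
    }
}
```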

      KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-274%3A+Kafka+Streams+Skipped+Records+Metrics

            People

              Assignee: vvcephei John Roesler
              Reporter: mjsax Matthias J. Sax
              Votes: 0
              Watchers: 4
