KAFKA-8037

KTable restore may load bad data

Details

    Description

      If an input topic contains bad data, users can specify a `deserialization.exception.handler` to drop corrupted records on read. However, this mechanism may be bypassed on restore. Assume a `builder.table()` call reads and drops a corrupted record. If the table state is later lost and restored from the changelog topic (for a source KTable, the input topic itself may be reused as the changelog), the corrupted record may be copied into the store, because restore copies plain bytes without deserializing them.
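      For context, the handler mentioned above is set through the Streams configuration. The following is a minimal sketch using only `java.util.Properties`; the application id and bootstrap address are placeholder assumptions, `LogAndContinueExceptionHandler` is the built-in handler that logs and skips a record, and depending on the Kafka Streams version the config key may carry a `default.` prefix.

```java
import java.util.Properties;

public class HandlerConfigSketch {

    // Builds a Streams configuration that skips records which fail deserialization.
    public static Properties streamsProps() {
        Properties props = new Properties();
        // Placeholder values, assumed for illustration only.
        props.put("application.id", "ktable-restore-demo");
        props.put("bootstrap.servers", "localhost:9092");
        // Key as written in this ticket; some releases name it
        // "default.deserialization.exception.handler" instead.
        props.put("deserialization.exception.handler",
                  "org.apache.kafka.streams.errors.LogAndContinueExceptionHandler");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(streamsProps());
    }
}
```

      This handler only applies while records are deserialized on the read path; it plays no role when bytes are copied during restore.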

      If the KTable is used in a join, an internal `store.get()` call to look up the record would fail with a deserialization exception if the value cannot be deserialized.
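      The failure mode described above can be reproduced in a simplified model. This is not Kafka Streams code: `RawRecord`, `process`, `restore`, and `lookupFails` are hypothetical names, the "deserializer" accepts only decimal integers, and the store is a plain map of raw bytes. It only illustrates that the read path validates records while the restore path does not.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RestoreBypassDemo {

    // Hypothetical stand-in for a topic record holding serialized bytes.
    public static final class RawRecord {
        final String key;
        final byte[] value;
        RawRecord(String key, String value) {
            this.key = key;
            this.value = value.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Stand-in deserializer: throws NumberFormatException on corrupted data.
    static int deserialize(byte[] bytes) {
        return Integer.parseInt(new String(bytes, StandardCharsets.UTF_8));
    }

    // One valid record ("a") and one corrupted record ("b").
    public static List<RawRecord> sampleTopic() {
        List<RawRecord> topic = new ArrayList<>();
        topic.add(new RawRecord("a", "1"));
        topic.add(new RawRecord("b", "not-a-number"));
        return topic;
    }

    // Read path: each record is deserialized first; failures are dropped,
    // mimicking a "continue" deserialization exception handler.
    public static Map<String, byte[]> process(List<RawRecord> topic) {
        Map<String, byte[]> store = new HashMap<>();
        for (RawRecord r : topic) {
            try {
                deserialize(r.value);
                store.put(r.key, r.value);
            } catch (NumberFormatException e) {
                // corrupted record is skipped
            }
        }
        return store;
    }

    // Restore path: raw bytes are copied without deserialization,
    // so the corrupted record ends up in the store.
    public static Map<String, byte[]> restore(List<RawRecord> topic) {
        Map<String, byte[]> store = new HashMap<>();
        for (RawRecord r : topic) {
            store.put(r.key, r.value);
        }
        return store;
    }

    // Mimics a typed lookup: returns true if deserializing the stored value fails.
    public static boolean lookupFails(Map<String, byte[]> store, String key) {
        try {
            deserialize(store.get(key));
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<RawRecord> topic = sampleTopic();
        Map<String, byte[]> processed = process(topic);
        Map<String, byte[]> restored = restore(topic);
        System.out.println("processed contains b: " + processed.containsKey("b"));
        System.out.println("restored contains b:  " + restored.containsKey("b"));
        System.out.println("lookup of b fails:    " + lookupFails(restored, "b"));
    }
}
```

      In this model the store built by `process` never contains the corrupted key, while the store built by `restore` does, and a later typed lookup of that key throws.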

      GlobalKTables are affected, too (cf. KAFKA-7663, which may allow a fix for the GlobalKTable case). It is unclear to me at the moment how this issue could be addressed for KTables, though.

      Note that user state stores are not affected, because they always have a dedicated changelog topic (and do not reuse an input topic), and thus the corrupted record would not be written into the changelog.

            People

              Assignee: Unassigned
              Reporter: Matthias J. Sax (mjsax)
              Votes: 1
              Watchers: 11
