  1. Kafka
  2. KAFKA-13320

Suggestion: SMT support for null key/value should be documented

Details

    • Type: Wish
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: connect
    • Labels: None

    Description

      While working with a JDBC Sink Connector, I noticed that some SMTs choke on a tombstone (a record with a null value) while others handle tombstones fine.

      For example:

      "transforms": "flattenKey,valueToJSON,wrapValue,addTimestamp",

      "transforms.flattenKey.type": "org.apache.kafka.connect.transforms.Flatten$Key",
      "transforms.flattenKey.delimiter": "_",

      "transforms.valueToJSON.type": "com.github.jcustenborder.kafka.connect.transform.common.ToJSON$Value",
      "transforms.valueToJSON.schemas.enable": "false",
      "transforms.valueToJSON.predicate": "tombstone",
      "transforms.valueToJSON.negate": true,

      "transforms.wrapValue.type": "org.apache.kafka.connect.transforms.HoistField$Value",
      "transforms.wrapValue.field": "matrix",
      "transforms.wrapValue.predicate": "tombstone",
      "transforms.wrapValue.negate": true,

      "transforms.addTimestamp.type": "org.apache.kafka.connect.transforms.InsertField$Value",
      "transforms.addTimestamp.timestamp.field": "message_timestamp",

      "predicates": "tombstone",
      "predicates.tombstone.type": "org.apache.kafka.connect.transforms.predicates.RecordIsTombstone"
      

      To avoid the cryptic error “java.lang.ClassCastException: class java.util.HashMap cannot be cast to class org.apache.kafka.connect.data.Struct” when processing a tombstone record, I had to add a negated RecordIsTombstone predicate to ToJSON (a community SMT) and to HoistField, but did not need one for InsertField.

      Digging into the source, I found that InsertField handles the case where the key or value is null:
      https://github.com/a0x8o/kafka/blob/f8237749f6ad34c09154f807e53273be64e1261e/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/InsertField.java#L130

      Thanks to this null check, there is no need to add a predicate to skip InsertField$Value when the value is null.
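
      The kind of guard described here can be modeled in a few lines. This is a standalone sketch in plain Java, not the Kafka source and not the Connect API: SimpleRecord and insertTimestamp are hypothetical names, and the record is reduced to a key, a schemaless value map, and a timestamp.

```java
import java.util.HashMap;
import java.util.Map;

// Standalone model of a null-tolerant value transform.
// Assumption: SimpleRecord and insertTimestamp are illustrative names only;
// they do not exist in Kafka Connect.
final class NullGuardSketch {

    // Minimal stand-in for a record: key, schemaless value, timestamp.
    static final class SimpleRecord {
        final Object key;
        final Map<String, Object> value; // null models a tombstone
        final long timestamp;

        SimpleRecord(Object key, Map<String, Object> value, long timestamp) {
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Adds the record timestamp under fieldName, unless the value is null.
    static SimpleRecord insertTimestamp(SimpleRecord record, String fieldName) {
        if (record.value == null) {
            return record; // tombstone: pass through unchanged
        }
        Map<String, Object> updated = new HashMap<>(record.value);
        updated.put(fieldName, record.timestamp);
        return new SimpleRecord(record.key, updated, record.timestamp);
    }

    public static void main(String[] args) {
        Map<String, Object> value = new HashMap<>();
        value.put("a", 1);
        SimpleRecord normal = new SimpleRecord("id-1", value, 2000L);
        SimpleRecord tombstone = new SimpleRecord("id-2", null, 1000L);

        System.out.println(insertTimestamp(normal, "message_timestamp").value);
        System.out.println(insertTimestamp(tombstone, "message_timestamp").value);
    }
}
```

      A transform written without the early return would have to dereference or cast the null value, which is where a tombstone would cause a failure.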

      It would help if the docs listed how the individual SMTs behave when dealing with a null key/value.

      Of course we can always find this out by trial and error or by studying the source code.
      But if documenting how an SMT handles a null key/value became a best practice, that would have two benefits:
      1) Save developers time when working with the official SMTs (those shipped with Kafka)
      2) Inspire developers who write their own SMTs to likewise document how they handle a null key/value

      Perhaps a standard way of dealing with nulls ("no-op if key/value is null") could be promoted, and SMT authors would only need to document their behavior when it differs.
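
      One way to picture the "no-op if key/value is null" default is as a wrapper around an arbitrary value transform. The sketch below is plain Java under stated assumptions: noOpOnNull is a hypothetical helper, not something Kafka Connect provides, and a transform is modeled as a simple UnaryOperator rather than a Connect Transformation.

```java
import java.util.function.UnaryOperator;

// Assumption: noOpOnNull is a hypothetical helper for illustration;
// Kafka Connect does not ship such a wrapper.
final class NoOpOnNullSketch {

    // Wraps a value transform so that a null input is returned untouched
    // instead of being handed to the wrapped transform.
    static <V> UnaryOperator<V> noOpOnNull(UnaryOperator<V> transform) {
        return value -> value == null ? null : transform.apply(value);
    }

    public static void main(String[] args) {
        // String::toUpperCase alone would throw NullPointerException on null.
        UnaryOperator<String> upper = noOpOnNull(String::toUpperCase);
        System.out.println(upper.apply("matrix")); // prints "MATRIX"
        System.out.println(upper.apply(null));     // prints "null"
    }
}
```

      With a shared default like this, an SMT author would only need to document the cases where a null key/value is deliberately handled differently.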

          People

            Assignee: Unassigned
            Reporter: Ben Ellis (benissimo)
            Votes: 0
            Watchers: 1