Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-15656

Frequent INVALID_RECORD on Kafka 3.6

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Duplicate
    • 3.6.0
    • None
    • producer
    • None

    Description

      Using this docker-compose.yml:

      version: "3.7"
      services:
        kafka:
          image: bitnami/kafka:latest
          network_mode: host
          environment:
            KAFKA_ENABLE_KRAFT: yes
            KAFKA_CFG_PROCESS_ROLES: controller,broker
            KAFKA_CFG_CONTROLLER_LISTENER_NAMES: CONTROLLER
            KAFKA_CFG_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093
            KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
            KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: 1@127.0.0.1:9093
            # Set this to "PLAINTEXT://127.0.0.1:9092" if you want to run this container on localhost via Docker
            KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
            KAFKA_CFG_NODE_ID: 1
            ALLOW_PLAINTEXT_LISTENER: yes
            KAFKA_KRAFT_CLUSTER_ID: XkpGZQ27R3eTl3OdTm2LYA # 16 byte base64-encoded UUID

      And running franz-go integration tests with KGO_TEST_RF=1, I consistently receive INVALID_RECORD errors.

       

      Looking at the container logs, I see these problematic log lines:

      2023-10-19 23:33:47,942] ERROR [ReplicaManager broker=1] Error processing append operation on partition 0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-0 (kafka.server.ReplicaManager) 
      org.apache.kafka.common.InvalidRecordException: Invalid negative header key size -25
      [2023-10-19 23:33:47,942] ERROR [ReplicaManager broker=1] Error processing append operation on partition 0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-6 (kafka.server.ReplicaManager) 
      org.apache.kafka.common.InvalidRecordException: Reached end of input stream before skipping all bytes. Remaining bytes:94
      [2023-10-19 23:33:47,942] ERROR [ReplicaManager broker=1] Error processing append operation on partition 0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-1 (kafka.server.ReplicaManager) 
      org.apache.kafka.common.InvalidRecordException: Found invalid number of record headers -26
      [2023-10-19 23:33:47,948] ERROR [ReplicaManager broker=1] Error processing append operation on partition 0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-6 (kafka.server.ReplicaManager) 
      org.apache.kafka.common.InvalidRecordException: Found invalid number of record headers -27
      [2023-10-19 23:33:47,950] ERROR [ReplicaManager broker=1] Error processing append operation on partition 0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-22 (kafka.server.ReplicaManager)
      org.apache.kafka.common.InvalidRecordException: Invalid negative header key size -25
      [2023-10-19 23:33:47,947] ERROR [ReplicaManager broker=1] Error processing append operation on partition c63b6e30987317fad18815effb8d432b6df677d2ab56cf6da517bb93fa49b74b-25 (kafka.server.ReplicaManager)
      org.apache.kafka.common.InvalidRecordException: Found invalid number of record headers -50
      [2023-10-19 23:33:47,959] ERROR [ReplicaManager broker=1] Error processing append operation on partition c63b6e30987317fad18815effb8d432b6df677d2ab56cf6da517bb93fa49b74b-25 (kafka.server.ReplicaManager) 
       

       

      I modified franz-go with a diff to print the request that was written to the wire once this error occurs. Attached is a v9 produce request. I deserialized it locally and am not seeing the corrupt data that Kafka is printing. It's possible there is a bug in the client, but again, these tests have never received this error pre-Kafka 3.6. It looks like there is either corruption when processing the incoming data, or there is some problematic race condition in the broker - I'm not sure which.

      Attachments

        1. invalid_record.log
          2 kB
          Travis Bischel

        Activity

          People

            Unassigned Unassigned
            twmb Travis Bischel
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: