  1. Kafka
  2. KAFKA-10635

Streams application fails with OutOfOrderSequenceException after rolling restarts of brokers

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s: 2.5.1
    • Fix Version/s: None
    • Component/s: core, producer
    • Labels: None

    Description

      We are upgrading our brokers from version 2.3.1 to 2.5.1 by installing the new version and then performing a rolling restart of the brokers. After the restarts, one of our Streams applications (client version 2.4.1) fails with OutOfOrderSequenceException:

       

      ERROR [2020-10-13 22:52:21,400] [com.aaa.bbb.ExceptionHandler] Unexpected error. Record: a_record, destination topic: topic-name-Aggregation-repartition org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number.
      ERROR [2020-10-13 22:52:21,413] [org.apache.kafka.streams.processor.internals.AssignedTasks] stream-thread [topic-name-StreamThread-1] Failed to commit stream task 1_39 due to the following error:
      org.apache.kafka.streams.errors.StreamsException: task [1_39] Abort sending since an error caught with a previous record (timestamp 1602654659000) to topic topic-name-Aggregation-repartition due to org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number.
              at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:144)
              at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:52)
              at org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:204)
              at org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1348)
              at org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:230)
              at org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:196)
              at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:730)
              at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:716)
              at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:674)
              at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:596)
              at org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74)
              at org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:798)
              at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109)
              at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:569)
              at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:561)
              at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:335)
              at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:244)
              at java.base/java.lang.Thread.run(Thread.java:834)
      Caused by: org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number.
      

      We see a corresponding error on the broker side:

      [2020-10-13 22:52:21,398] ERROR [ReplicaManager broker=137636348] Error processing append operation on partition topic-name-Aggregation-repartition-52 (kafka.server.ReplicaManager)
      org.apache.kafka.common.errors.OutOfOrderSequenceException: Out of order sequence number for producerId 2819098 at offset 1156041 in partition topic-name-Aggregation-repartition-52: 29 (incoming seq. number), -1 (current end sequence number)
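
      To make the broker message easier to read: the two numbers at the end are the first sequence number of the incoming batch (29) and the last sequence number the broker holds for that producer on the partition (-1, which we take to mean that it holds none). The sketch below is only a simplified illustration of an "is the next batch in sequence" check under that assumption. The class and method names are hypothetical, and the real broker logic also considers producer epochs and other cases that are not modelled here.

```java
// Simplified, hypothetical illustration of an idempotent-producer sequence check.
// This is NOT the broker's actual code; names and structure are assumptions.
public class SequenceCheckSketch {

    // Assumed sentinel: the value shown in the broker log when no sequence is recorded.
    static final int NO_SEQUENCE = -1;

    // Returns true if incomingFirstSeq directly follows lastSeq.
    // Sequence numbers are assumed to wrap from Integer.MAX_VALUE back to 0.
    static boolean inSequence(int lastSeq, int incomingFirstSeq) {
        int expectedNext = (lastSeq == Integer.MAX_VALUE) ? 0 : lastSeq + 1;
        return incomingFirstSeq == expectedNext;
    }

    public static void main(String[] args) {
        // Normal case: the broker last saw 28, the producer sends 29.
        System.out.println(inSequence(28, 29));

        // Case from the log above: the broker has no recorded sequence (-1),
        // so under this simplified rule only 0 would be accepted, and 29 is rejected.
        System.out.println(inSequence(NO_SEQUENCE, 29));
    }
}
```

      Under these assumptions the second call returns false, which matches the shape of the broker error: the producer is continuing from sequence 29, while the broker behaves as if it has no sequence state for producerId 2819098 on that partition.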
      

      We can reproduce this repeatedly, and it happens regardless of whether the broker shutdown during the restart is clean or unclean. However, when we roll the brokers back from 2.5.1 to 2.3.1 and perform the same rolling restarts, the Streams application does not hit this error at all. This is blocking us from upgrading our brokers.

       

      Attachments

        1. logs.csv
          8 kB
          Nicholas Telford

            People

              Assignee: Unassigned
              Reporter: Peeraya Maetasatidsuk (ascii80)
              Votes: 0
              Watchers: 8