  1. Kafka
  2. KAFKA-8484

ProducerId reset can cause IllegalStateException

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0
    • Component/s: producer
    • Labels: None

    Description

      If the producerId is reset while inflight requests are pending, we can get the following uncaught error:

      [2019-06-03 08:20:45,320] ERROR [Producer clientId=producer-1] Uncaught error in request completion: (org.apache.kafka.clients.NetworkClient)
      java.lang.IllegalStateException: Sequence number for partition test_topic-13 is going to become negative : -965
              at org.apache.kafka.clients.producer.internals.TransactionManager.adjustSequencesDueToFailedBatch(TransactionManager.java:561)
              at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:744)
              at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:717)
              at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:667)
              at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:574)
              at org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:75)
              at org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:818)
              at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109)
              at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:561)
              at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:553)
              at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:335)
              at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:253)
              at java.lang.Thread.run(Thread.java:748)
      

      As a result, the failed batch is not completed until the delivery timeout is exceeded. The root cause is a missing check: when we receive a produce response, we do not validate that the batch's producerId and epoch still match the producer's current ones.
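
      The missing check can be illustrated with a minimal sketch. This is not the actual fix and does not use the real client classes: ProducerIdResetSketch, ProducerIdAndEpoch, Batch, SequenceTracker and all of their methods are hypothetical names, and the sequence bookkeeping is heavily simplified. It only shows the idea of skipping the sequence adjustment when a failed batch was sent under a producerId/epoch that has since been reset.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these types are the real Kafka client classes.
final class ProducerIdResetSketch {

    /** Producer identity that a batch captures when it is sent. */
    static final class ProducerIdAndEpoch {
        final long producerId;
        final short epoch;

        ProducerIdAndEpoch(long producerId, short epoch) {
            this.producerId = producerId;
            this.epoch = epoch;
        }

        boolean matches(ProducerIdAndEpoch other) {
            return producerId == other.producerId && epoch == other.epoch;
        }
    }

    /** A sent batch, remembering the identity it was sent under. */
    static final class Batch {
        final String partition;
        final ProducerIdAndEpoch producerIdAndEpoch;
        final int recordCount;

        Batch(String partition, ProducerIdAndEpoch producerIdAndEpoch, int recordCount) {
            this.partition = partition;
            this.producerIdAndEpoch = producerIdAndEpoch;
            this.recordCount = recordCount;
        }
    }

    /** Simplified stand-in for per-partition sequence tracking. */
    static final class SequenceTracker {
        private ProducerIdAndEpoch current;
        private final Map<String, Integer> nextSequence = new HashMap<>();

        SequenceTracker(ProducerIdAndEpoch initial) {
            this.current = initial;
        }

        int nextSequence(String partition) {
            return nextSequence.getOrDefault(partition, 0);
        }

        void recordSent(Batch batch) {
            nextSequence.merge(batch.partition, batch.recordCount, Integer::sum);
        }

        /** A reset installs a new identity and restarts all sequences at 0. */
        void resetProducerId(ProducerIdAndEpoch fresh) {
            current = fresh;
            nextSequence.clear();
        }

        /**
         * Returns true if sequences were rewound for the failed batch, or
         * false if the batch belongs to an old producerId/epoch and was skipped.
         */
        boolean handleFailedBatch(Batch batch) {
            // The guard: a stale batch must not touch the current sequences.
            if (!current.matches(batch.producerIdAndEpoch)) {
                return false;
            }
            int adjusted = nextSequence(batch.partition) - batch.recordCount;
            if (adjusted < 0) {
                throw new IllegalStateException("Sequence number for partition "
                        + batch.partition + " is going to become negative: " + adjusted);
            }
            nextSequence.put(batch.partition, adjusted);
            return true;
        }
    }

    public static void main(String[] args) {
        ProducerIdAndEpoch old = new ProducerIdAndEpoch(1000L, (short) 0);
        SequenceTracker tracker = new SequenceTracker(old);
        Batch inflight = new Batch("test_topic-13", old, 5);
        tracker.recordSent(inflight);

        // The producerId is reset while the batch is still in flight.
        tracker.resetProducerId(new ProducerIdAndEpoch(2000L, (short) 0));

        System.out.println("stale batch adjusted: " + tracker.handleFailedBatch(inflight));
        System.out.println("next sequence: " + tracker.nextSequence("test_topic-13"));
    }
}
```

      In this sketch, removing the guard in handleFailedBatch would make the stale batch compute 0 - 5 and hit the IllegalStateException, which mirrors the failure mode in the stack trace above. With the guard, the stale batch is ignored by the sequence bookkeeping and can be failed normally.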

            People

              Assignee: Jason Gustafson (hachikuji)
              Reporter: Jason Gustafson (hachikuji)