  1. Kafka
  2. KAFKA-8484

ProducerId reset can cause IllegalStateException

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0
    • Component/s: producer
    • Labels:
      None

      Description

      If the producerId is reset while in-flight requests are pending, we can get the following uncaught error:

      [2019-06-03 08:20:45,320] ERROR [Producer clientId=producer-1] Uncaught error in request completion: (org.apache.kafka.clients.NetworkClient)
      java.lang.IllegalStateException: Sequence number for partition test_topic-13 is going to become negative : -965
              at org.apache.kafka.clients.producer.internals.TransactionManager.adjustSequencesDueToFailedBatch(TransactionManager.java:561)
              at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:744)
              at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:717)
              at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:667)
              at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:574)
              at org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:75)
              at org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:818)
              at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109)
              at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:561)
              at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:553)
              at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:335)
              at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:253)
              at java.lang.Thread.run(Thread.java:748)
      

      As a result, the failed batch is not completed until the delivery timeout is exceeded. The underlying problem is a missing validation: when we receive a produce response, we do not check that the producerId and epoch still match.
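
A minimal sketch of the failure mode and of the kind of check described above. This is not the actual Kafka code or the actual fix: `SequenceTracker`, `adjustForFailedBatch`, and `adjustIfCurrent` are hypothetical stand-ins for the producer's sequence bookkeeping, and the batch size of 965 records is an assumption chosen only to reproduce the number in the log. The idea is that a batch sent under an old producerId/epoch should not be used to rewind sequence numbers that now belong to a new producerId/epoch.

```java
import java.util.HashMap;
import java.util.Map;

// Demo driver: reset the producerId while a batch is in flight, then fail that batch.
class ProducerIdResetSketch {
    public static void main(String[] args) {
        SequenceTracker tracker = new SequenceTracker(1000L, (short) 0);
        tracker.assign("test_topic-13", 965);   // in-flight batch sent under producerId 1000 (size is assumed)
        tracker.reset(2000L, (short) 0);        // producerId reset while the batch is still in flight

        // Unguarded path: the stale batch rewinds a sequence that was already reset to 0.
        try {
            tracker.adjustForFailedBatch("test_topic-13", 965);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // Guarded path: the stale batch is recognized and the adjustment is skipped.
        boolean adjusted = tracker.adjustIfCurrent(1000L, (short) 0, "test_topic-13", 965);
        System.out.println("adjusted = " + adjusted);
    }
}

// Hypothetical stand-in for per-partition sequence bookkeeping; not the real TransactionManager.
final class SequenceTracker {
    private long producerId;
    private short epoch;
    private final Map<String, Integer> nextSequence = new HashMap<>();

    SequenceTracker(long producerId, short epoch) {
        this.producerId = producerId;
        this.epoch = epoch;
    }

    // Simulates a producerId reset: new identity, all sequences start over at 0.
    void reset(long newProducerId, short newEpoch) {
        this.producerId = newProducerId;
        this.epoch = newEpoch;
        nextSequence.clear();
    }

    // Returns the first sequence for a new batch and advances by its record count.
    int assign(String partition, int recordCount) {
        int sequence = nextSequence.getOrDefault(partition, 0);
        nextSequence.put(partition, sequence + recordCount);
        return sequence;
    }

    // Rewinds the partition's next sequence by the failed batch's record count.
    void adjustForFailedBatch(String partition, int recordCount) {
        int adjusted = nextSequence.getOrDefault(partition, 0) - recordCount;
        if (adjusted < 0) {
            throw new IllegalStateException("Sequence number for partition " + partition
                    + " is going to become negative : " + adjusted);
        }
        nextSequence.put(partition, adjusted);
    }

    // Adjusts only if the batch was sent under the current producerId and epoch.
    boolean adjustIfCurrent(long batchProducerId, short batchEpoch, String partition, int recordCount) {
        if (batchProducerId != producerId || batchEpoch != epoch) {
            return false; // stale batch: its sequences no longer apply
        }
        adjustForFailedBatch(partition, recordCount);
        return true;
    }
}
```

In this sketch the unguarded call throws the same kind of `IllegalStateException` as in the log, while the guarded call returns `false` and leaves the new sequence state untouched. How the real fix handles a stale batch (for example, how it is failed or retried) is not described here and is not modeled.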

              People

              • Assignee:
                hachikuji Jason Gustafson
                Reporter:
                hachikuji Jason Gustafson
              • Votes:
                0
                Watchers:
                2
