Kafka / KAFKA-4815 Idempotent/transactional Producer (KIP-98) / KAFKA-5429

Producer IllegalStateException: Batch has already been completed

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0.0
    • Component/s: clients, core, producer
    • Labels: None

    Description

      I've seen this a few times in system tests:

      [2017-06-10 19:47:38,434] ERROR Uncaught error in request completion: (org.apache.kafka.clients.NetworkClient)
      java.lang.IllegalStateException: Batch has already been completed
              at org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:157)
              at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:576)
              at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:555)
              at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:479)
              at org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:75)
              at org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:666)
              at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:101)
              at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:454)
              at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:446)
              at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:206)
              at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:162)
              at java.lang.Thread.run(Thread.java:745)
      [2
      

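      I think this is probably caused by aborting in-flight batches after the producer enters an error state: the abort completes the batch once, and the produce response that arrives afterwards tries to complete it a second time. See the following log: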

      [2017-06-10 19:47:38,425] ERROR Aborting producer batches due to fatal error (org.apache.kafka.clients.producer.internals.Sender)
      org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number
      [2017-06-10 19:47:38,425] DEBUG [TransactionalId my-first-transactional-id] Transition from state ABORTABLE_ERROR to ABORTING_TRANSACTION (org.apache.kafka.clients.producer.internals.TransactionManager)
      [2017-06-10 19:47:38,425] TRACE Produced messages to topic-partition output-topic-0 with base offset offset -1 and error: {}. (org.apache.kafka.clients.producer.internals.ProducerBatch)
      org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number
      [2017-06-10 19:47:38,425] DEBUG [TransactionalId my-first-transactional-id] Enqueuing transactional request (type=EndTxnRequest, transactionalId=my-first-transactional-id, producerId=2000, producerEpoch=0, result=ABORT) (org.apache.kafka.clients.producer.internals.TransactionManager)
      [2017-06-10 19:47:38,426] TRACE [TransactionalId my-first-transactional-id] Request (type=EndTxnRequest, transactionalId=my-first-transactional-id, producerId=2000, producerEpoch=0, result=ABORT) dequeued for sending (org.apache.kafka.clients.producer.internals.TransactionManager)
      [2017-06-10 19:47:38,426] DEBUG [TransactionalId my-first-transactional-id] Sending transactional request (type=EndTxnRequest, transactionalId=my-first-transactional-id, producerId=2000, producerEpoch=0, result=ABORT) to node worker11:9092 (id: 3 rack: null) (org.apache.kafka.clients.producer.internals.Sender)
      [2017-06-10 19:47:38,434] TRACE Received produce response from node 2 with correlation id 514 (org.apache.kafka.clients.producer.internals.Sender)
      [2017-06-10 19:47:38,434] DEBUG Incremented sequence number for topic-partition output-topic-0 to 4500 (org.apache.kafka.clients.producer.internals.Sender)
      [2017-06-10 19:47:38,434] TRACE Produced messages to topic-partition output-topic-0 with base offset offset 7033 and error: null. (org.apache.kafka.clients.producer.internals.ProducerBatch)
      [2017-06-10 19:47:38,434] ERROR Uncaught error in request completion: (org.apache.kafka.clients.NetworkClient)
      java.lang.IllegalStateException: Batch has already been completed
      

      A simple solution is to add a separate flag to indicate that the batch has been aborted. We can check it when the response returns and skip the callback.
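
      The proposed flag could look roughly like the sketch below. It is only an illustration of the idea, not the change that was committed to Kafka: the class AbortableBatchSketch, the nested Batch class, the FinalState enum, and every method on them are hypothetical stand-ins for ProducerBatch. The batch records how it was completed, and a produce response that arrives after an abort is ignored instead of raising IllegalStateException.

```java
import java.util.concurrent.atomic.AtomicReference;

public class AbortableBatchSketch {

    /** Hypothetical terminal states for a batch; an assumption, not Kafka's own type. */
    enum FinalState { ABORTED, FAILED, SUCCEEDED }

    /** Simplified, hypothetical stand-in for ProducerBatch. */
    static final class Batch {
        // null means the batch is still in flight.
        private final AtomicReference<FinalState> finalState = new AtomicReference<>(null);
        private int callbacksFired = 0;

        /** Called when the producer aborts in-flight batches after a fatal error. */
        void abort(RuntimeException cause) {
            if (!finalState.compareAndSet(null, FinalState.ABORTED)) {
                throw new IllegalStateException("Batch has already been completed");
            }
            fireCallbacks(-1L, cause);
        }

        /**
         * Called when the produce response returns.
         * Returns true if this call completed the batch, false if the
         * response was ignored because the batch had already been aborted.
         */
        boolean done(long baseOffset, RuntimeException error) {
            FinalState target = (error == null) ? FinalState.SUCCEEDED : FinalState.FAILED;
            if (finalState.compareAndSet(null, target)) {
                fireCallbacks(baseOffset, error);
                return true;
            }
            if (finalState.get() == FinalState.ABORTED) {
                // Late response for an aborted batch: skip the callbacks.
                return false;
            }
            // Completing twice without an abort in between is still a bug.
            throw new IllegalStateException("Batch has already been completed");
        }

        int callbacksFired() {
            return callbacksFired;
        }

        private void fireCallbacks(long baseOffset, RuntimeException error) {
            // A real batch would complete the per-record futures here.
            callbacksFired++;
        }
    }

    public static void main(String[] args) {
        Batch batch = new Batch();
        // The fatal error aborts the in-flight batch first ...
        batch.abort(new RuntimeException("out of order sequence number"));
        // ... and the produce response for the same batch arrives afterwards.
        boolean completedByResponse = batch.done(7033L, null);
        System.out.println("completed by response: " + completedByResponse
                + ", callbacks fired: " + batch.callbacksFired());
    }
}
```

      In this sketch a double completion that does not follow an abort still throws, so the check hides only the abort-then-response race seen in the log above.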

            People

              Assignee: Jason Gustafson (hachikuji)
              Reporter: Jason Gustafson (hachikuji)