Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-9620

Task revocation failure could introduce remaining unclean tasks

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0
    • Component/s: None
    • Labels:
      None

      Description

      The task revocation call should enforce the close of a task, otherwise we could potentially hit the exception during `handleAssignment`.

      During revoke we failed:

       

      [2020-02-27T11:05:48-08:00] (streams-soak-trunk-eos_soak_i-099dc04bf946ce2f0_streamslog) [2020-02-27 19:05:47,321] ERROR [stream-soak-test-d1c291a8-ee54-4058-ac9c-7cd46d5484de-StreamThread-1] [Consumer clientId=stream-soak-test-d1c291a8-ee54-4058-ac9c-7cd46d5484de-StreamThread-1-consumer, groupId=stream-soak-test] User provided listener org.apache.kafka.streams.processor.internals.StreamsRebalanceListener failed on invocation of onPartitionsRevoked for partitions [logs.json.kafka-2, logs.json.zookeeper-2, node-name-repartition-1, logs.kubernetes-2, windowed-node-counts-1, logs.operator-2, logs.syslog-2] (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
      [2020-02-27T11:05:48-08:00] (streams-soak-trunk-eos_soak_i-099dc04bf946ce2f0_streamslog) org.apache.kafka.streams.errors.TaskMigratedException: Producer get fenced trying to commit a transaction; it means all tasks belonging to this thread should be migrated.
              at org.apache.kafka.streams.processor.internals.StreamsProducer.commitTransaction(StreamsProducer.java:172)
              at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.commit(RecordCollectorImpl.java:226)
              at org.apache.kafka.streams.processor.internals.StreamTask.commitState(StreamTask.java:368)
              at org.apache.kafka.streams.processor.internals.StreamTask.suspend(StreamTask.java:242)
              at org.apache.kafka.streams.processor.internals.TaskManager.handleRevocation(TaskManager.java:314)
              at org.apache.kafka.streams.processor.internals.StreamsRebalanceListener.onPartitionsRevoked(StreamsRebalanceListener.java:72)
              at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.invokePartitionsRevoked(ConsumerCoordinator.java:297)
              at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:383)
              at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:439)
              at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:358)
              at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:477)
              at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1277)
              at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1243)
              at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1218)
              at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:920)
              at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:800)
              at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:749)
              at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:725)
      [2020-02-27T11:05:48-08:00] (streams-soak-trunk-eos_soak_i-099dc04bf946ce2f0_streamslog) Caused by: org.apache.kafka.common.errors.ProducerFencedException: Producer attempted an operation with an old epoch. Either there is a newer producer with the same transactionalId, or the producer's transaction has been expired by the broker.
      

      During assignment we are checking the cleanness of task close and throw fatal:

      [2020-02-27T11:05:48-08:00] (streams-soak-trunk-eos_soak_i-099dc04bf946ce2f0_streamslog) [2020-02-27 19:05:48,032] ERROR [stream-soak-test-d1c291a8-ee54-4058-ac9c-7cd46d5484de-StreamThread-1] stream-thread [stream-soak-test-d1c291a8-ee54-4058-ac9c-7cd46d5484de-StreamThread-1] Encountered the following exception during processing and the thread is going to shut down:  (org.apache.kafka.streams.processor.internals.StreamThread) [2020-02-27T11:05:48-08:00] (streams-soak-trunk-eos_soak_i-099dc04bf946ce2f0_streamslog) java.lang.RuntimeException: Unexpected failure to close 1 task(s) [[0_2]]. First exception (for task 0_2) follows.         at org.apache.kafka.streams.processor.internals.TaskManager.handleAssignment(TaskManager.java:205)         at org.apache.kafka.streams.processor.internals.StreamsPartitionAssignor.onAssignment(StreamsPartitionAssignor.java:1176)         at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:397)         at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:439)         at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:358)         at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:477)         at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1277)         at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1243)         at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1218)         at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:920)         at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:800)         at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:749)         at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:725) [2020-02-27T11:05:48-08:00] (streams-soak-trunk-eos_soak_i-099dc04bf946ce2f0_streamslog) Caused by: org.apache.kafka.streams.errors.TaskMigratedException: Producer get fenced trying to commit a transaction; it means all tasks belonging to this thread should be migrated.
      

       

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                bchen225242 Boyang Chen
                Reporter:
                bchen225242 Boyang Chen
              • Votes:
                0 Vote for this issue
                Watchers:
                5 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: