Kafka / KAFKA-6631

Kafka Streams - Rebalancing exception in Kafka 1.0.0

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: None
    • Component/s: streams
    • Labels: None
    • Environment: Container Linux by CoreOS 1576.5.0

    Description

       
      In Kafka Streams 1.0.0 we saw a strange rebalance error. Our streams app performs window-based aggregations. Sometimes on startup, when all stream workers join, the app just crashes. If we enable only one worker, it works fine. Sometimes two workers also work fine, but when a third joins, the app crashes again. This looks like a critical issue with rebalancing.

      2018-03-08T18:51:01.226243000Z org.apache.kafka.common.KafkaException: Unexpected error from SyncGroup: The server experienced an unexpected error when processing the request
      2018-03-08T18:51:01.226557000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:566)
      2018-03-08T18:51:01.226860000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:539)
      2018-03-08T18:51:01.227328000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:808)
      2018-03-08T18:51:01.227630000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:788)
      2018-03-08T18:51:01.228152000Z at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
      2018-03-08T18:51:01.228449000Z at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
      2018-03-08T18:51:01.228897000Z at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
      2018-03-08T18:51:01.229196000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:506)
      2018-03-08T18:51:01.229673000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:353)
      2018-03-08T18:51:01.229971000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:268)
      2018-03-08T18:51:01.230436000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:214)
      2018-03-08T18:51:01.230749000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:174)
      2018-03-08T18:51:01.231065000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:364)
      2018-03-08T18:51:01.231584000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
      2018-03-08T18:51:01.231911000Z at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
      2018-03-08T18:51:01.232190000Z at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1138)
      2018-03-08T18:51:01.232643000Z at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1103)
      2018-03-08T18:51:01.233121000Z at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
      2018-03-08T18:51:01.233409000Z at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
      2018-03-08T18:51:01.233720000Z at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
      2018-03-08T18:51:01.234196000Z at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)
      2018-03-08T18:51:01.234655000Z org.apache.kafka.common.KafkaException: Unexpected error from SyncGroup: The server experienced an unexpected error when processing the request
      2018-03-08T18:51:01.234972000Z exception in thread, closing process
      2018-03-08T18:51:01.235500000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:566)
      2018-03-08T18:51:01.235839000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:539)
      2018-03-08T18:51:01.236336000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:808)
      2018-03-08T18:51:01.236603000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:788)
      2018-03-08T18:51:01.236889000Z at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
      2018-03-08T18:51:01.237092000Z at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
      2018-03-08T18:51:01.237531000Z at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
      2018-03-08T18:51:01.237816000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:506)
      2018-03-08T18:51:01.238097000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:353)
      2018-03-08T18:51:01.238395000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:268)
      2018-03-08T18:51:01.238698000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:214)
      2018-03-08T18:51:01.239511000Z exception in thread, closing process
      2018-03-08T18:51:01.239880000Z exception in thread, closing process
      2018-03-08T18:51:01.240175000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:174)
      2018-03-08T18:51:01.240443000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:364)
      2018-03-08T18:51:01.240764000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
      2018-03-08T18:51:01.241083000Z at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
      2018-03-08T18:51:01.241367000Z at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1138)
      2018-03-08T18:51:01.241789000Z at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1103)
      2018-03-08T18:51:01.242075000Z at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
      2018-03-08T18:51:01.242351000Z at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
      2018-03-08T18:51:01.242641000Z at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
      2018-03-08T18:51:01.243051000Z at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)
      

      Looking further at the broker logs, I saw another exception:

      Appending metadata message for group AnomalyKafkaStreams generation 12 failed due to org.apache.kafka.common.errors.RecordTooLargeException, returning UNKNOWN error code to the client (kafka.coordinator.group.GroupMetadataManager)
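
      The broker message above names the group, the generation, and the underlying cause, which the client-side SyncGroup error hides behind an UNKNOWN error code. A RecordTooLargeException at this point most likely means the group metadata written by the coordinator exceeded the maximum record size the broker accepts for that log, though the report does not confirm the exact limit involved. Below is a minimal sketch, using only the Java standard library, of pulling those fields out of broker log lines so they can be matched against client failures. The class name GroupAppendFailure and its methods are hypothetical helpers for illustration, not part of Kafka.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical triage helper; not a Kafka API. */
public class GroupAppendFailure {
    // Pattern assumed from the single broker log line quoted in this report.
    private static final Pattern LINE = Pattern.compile(
        "Appending metadata message for group (\\S+) generation (\\d+) failed due to ([\\w.$]+)");

    final String group;
    final int generation;
    final String cause;

    GroupAppendFailure(String group, int generation, String cause) {
        this.group = group;
        this.generation = generation;
        this.cause = cause;
    }

    /** Returns the parsed failure, or empty if the line is not an append failure. */
    static Optional<GroupAppendFailure> parse(String logLine) {
        Matcher m = LINE.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new GroupAppendFailure(
            m.group(1), Integer.parseInt(m.group(2)), m.group(3)));
    }

    /** True when the append failed because the record was too large. */
    boolean isRecordTooLarge() {
        return cause.endsWith("RecordTooLargeException");
    }

    public static void main(String[] args) {
        String line = "Appending metadata message for group AnomalyKafkaStreams generation 12 "
            + "failed due to org.apache.kafka.common.errors.RecordTooLargeException, "
            + "returning UNKNOWN error code to the client (kafka.coordinator.group.GroupMetadataManager)";
        parse(line).ifPresent(f -> System.out.println(
            f.group + " gen=" + f.generation + " tooLarge=" + f.isRecordTooLarge()));
    }
}
```

      Running such a helper over the broker logs around the time of a client crash would show whether every failed SyncGroup lines up with an oversized metadata append for the same group.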
      

            People

              Assignee: Unassigned
              Reporter: Alexander Ivanichev (alex.iv)
              Votes: 0
              Watchers: 7
