Kafka / KAFKA-6366

StackOverflowError in kafka-coordinator-heartbeat-thread


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.1
    • Component/s: consumer
    • Labels: None

    Description

      With Kafka 1.0, our consumer groups fall into a permanent cycle of rebalancing once a StackOverflowError has occurred in the heartbeat thread. The error is triggered by connectivity issues between the consumers and the coordinating broker.

      Immediately before the exception there are hundreds, if not thousands, of log entries of the following type:

      2017-12-12 16:23:12.361 [kafka-coordinator-heartbeat-thread | my-consumer-group] INFO - [Consumer clientId=consumer-4, groupId=my-consumer-group] Marking the coordinator <IP>:<Port> (id: 2147483645 rack: null) dead

      The exception is always thrown somewhere in the DateFormat code,
      although at different lines.

      2017-12-12 16:23:12.363 [kafka-coordinator-heartbeat-thread | my-consumer-group] ERROR - Uncaught exception in thread 'kafka-coordinator-heartbeat-thread | my-consumer-group':
      java.lang.StackOverflowError
      at java.text.DateFormatSymbols.getProviderInstance(DateFormatSymbols.java:362)
      at java.text.DateFormatSymbols.getInstance(DateFormatSymbols.java:340)
      at java.util.Calendar.getDisplayName(Calendar.java:2110)
      at java.text.SimpleDateFormat.subFormat(SimpleDateFormat.java:1125)
      at java.text.SimpleDateFormat.format(SimpleDateFormat.java:966)
      at java.text.SimpleDateFormat.format(SimpleDateFormat.java:936)
      at java.text.DateFormat.format(DateFormat.java:345)
      at org.apache.log4j.helpers.PatternParser$DatePatternConverter.convert(PatternParser.java:443)
      at org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65)
      at org.apache.log4j.PatternLayout.format(PatternLayout.java:506)
      at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
      at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
      at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
      at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
      at org.apache.log4j.Category.callAppenders(Category.java:206)
      at org.apache.log4j.Category.forcedLog(Category.java:391)
      at org.apache.log4j.Category.log(Category.java:856)
      at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:324)
      at org.apache.kafka.common.utils.LogContext$KafkaLogger.info(LogContext.java:341)
      at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead(AbstractCoordinator.java:649)
      at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onFailure(AbstractCoordinator.java:797)
      at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onFailure(RequestFuture.java:209)
      at org.apache.kafka.clients.consumer.internals.RequestFuture.fireFailure(RequestFuture.java:177)
      at org.apache.kafka.clients.consumer.internals.RequestFuture.raise(RequestFuture.java:147)
      at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:496)
      ...
      the following 9 lines are repeated around a hundred times.
      ...
      at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:496)
      at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:353)
      at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.failUnsentRequests(ConsumerNetworkClient.java:416)
      at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.disconnect(ConsumerNetworkClient.java:388)
      at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead(AbstractCoordinator.java:653)
      at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onFailure(AbstractCoordinator.java:797)
      at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onFailure(RequestFuture.java:209)
      at org.apache.kafka.clients.consumer.internals.RequestFuture.fireFailure(RequestFuture.java:177)
      at org.apache.kafka.clients.consumer.internals.RequestFuture.raise(RequestFuture.java:147)
      at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:496)
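
The repeating frames suggest a re-entrant cycle: a failure callback calls coordinatorDead, which disconnects and fails the remaining requests, whose callbacks call coordinatorDead again. The following is a minimal sketch of that pattern, not Kafka's actual code. The class ReentrantFailureSketch, its fields, and the guarded flag are hypothetical names introduced for illustration, and the guard shown is a generic re-entrancy check, not necessarily how this issue was fixed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified, hypothetical model of the call cycle seen in the stack trace.
public class ReentrantFailureSketch {
    // Failure callbacks of requests that have not completed yet.
    private final Deque<Runnable> pendingFailures = new ArrayDeque<>();
    private final boolean guarded;
    private boolean markingDead = false;
    private int depth = 0;
    int maxDepth = 0;

    ReentrantFailureSketch(boolean guarded, int pendingRequests) {
        this.guarded = guarded;
        // Every pending request reacts to failure by marking the coordinator dead.
        for (int i = 0; i < pendingRequests; i++) {
            pendingFailures.add(this::coordinatorDead);
        }
    }

    void coordinatorDead() {
        // Generic re-entrancy guard (an assumption of this sketch).
        if (guarded && markingDead) {
            return;
        }
        markingDead = true;
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        try {
            // Stands in for disconnect -> failUnsentRequests -> fireCompletion.
            failPendingRequests();
        } finally {
            depth--;
            markingDead = depth > 0;
        }
    }

    private void failPendingRequests() {
        Runnable callback;
        // Each callback may re-enter coordinatorDead before this loop continues.
        while ((callback = pendingFailures.poll()) != null) {
            callback.run();
        }
    }

    // Returns the deepest nesting of coordinatorDead calls that was reached.
    static int maxDepthFor(boolean guarded, int pendingRequests) {
        ReentrantFailureSketch sketch = new ReentrantFailureSketch(guarded, pendingRequests);
        sketch.coordinatorDead();
        return sketch.maxDepth;
    }

    public static void main(String[] args) {
        System.out.println("unguarded depth: " + maxDepthFor(false, 100));
        System.out.println("guarded depth: " + maxDepthFor(true, 100));
    }
}
```

In this model the nesting depth without the guard grows by one per pending request, so enough queued requests exhaust the thread's stack. The overflow then surfaces in whichever frame happens to be on top, which would be consistent with the error appearing at different lines of the logging code.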

      Attachments

        1. 6366.v1.txt
          2 kB
          Ted Yu
        2. ConverterProcessor_DEBUG.zip
          3.51 MB
          Joerg Heinicke
        3. ConverterProcessor.zip
          70 kB
          Joerg Heinicke
        4. Screenshot-2017-12-19 21.35-22.10 processing.png
          16 kB
          Joerg Heinicke

          People

            Assignee: Jason Gustafson (hachikuji)
            Reporter: Joerg Heinicke (joerg.heinicke)
            Votes: 0
            Watchers: 6