Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Cannot Reproduce
-
0.10.1.0
-
kafka@86a156fd9dda:~$ java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
kafka@86a156fd9dda:~$ uname -a
Linux 86a156fd9dda 4.7.3-coreos-r2 #1 SMP Tue Nov 1 01:38:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
kafka@86a156fd9dda:~$ ps alx | grep java
4 1000 1 0 20 0 75887304 3820220 futex_ Ssl ? 9379:49 /usr/lib/jvm/java-8-oracle/bin/java -Xmx3G -Xms3G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC -Djava.awt.headless=true -Xloggc:/kafka/bin/../logs/kafkaServer-gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=10.60.68.122 -Dcom.sun.management.jmxremote.rmi.port=7203 -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7203 -Dkafka.logs.dir=/kafka/bin/../logs -Dlog4j.configuration=file:/kafka/bin/../config/log4j.properties -cp :/kafka/bin/../libs/aopalliance-repackaged-2.4.0-b34.jar:/kafka/bin/../libs/argparse4j-0.5.0.jar:/kafka/bin/../libs/connect-api-0.10.1.0.jar:/kafka/bin/../libs/connect-file-0.10.1.0.jar:/kafka/bin/../libs/connect-json-0.10.1.0.jar:/kafka/bin/../libs/connect-runtime-0.10.1.0.jar:/kafka/bin/../libs/guava-18.0.jar:/kafka/bin/../libs/hk2-api-2.4.0-b34.jar:/kafka/bin/../libs/hk2-locator-2.4.0-b34.jar:/kafka/bin/../libs/hk2-utils-2.4.0-b34.jar:/kafka/bin/../libs/jackson-annotations-2.6.0.jar:/kafka/bin/../libs/jackson-core-2.6.3.jar:/kafka/bin/../libs/jackson-databind-2.6.3.jar:/kafka/bin/../libs/jackson-jaxrs-base-2.6.3.jar:/kafka/bin/../libs/jackson-jaxrs-json-provider-2.6.3.jar:/kafka/bin/../libs/jackson-module-jaxb-annotations-2.6.3.jar:/kafka/bin/../libs/javassist-3.18.2-GA.jar:/kafka/bin/../libs/javax.annotation-api-1.2.jar:/kafka/bin/../libs/javax.inject-1.jar:/kafka/bin/../libs/javax.inject-2.4.0-b34.jar:/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/kafka/bin/../libs/javax.ws.rs-api-2.0.1.jar:/kafka/bin/../libs/jersey-client-2.22.2.jar:/kafka/bin/../libs/jersey-common-2.22.2.jar:/kafka/bin/../libs/jersey-container-servlet-2.22.2.jar:/kafka/bin/../libs/jersey-container-servlet-core-2.22.2.jar:/kafka/bin/../libs/jersey-guava-2.22.2.jar:/kafka/bin/../libs/jersey-media-jaxb-2.22.2.jar:/kafka/bin/../libs/jersey-server-2.22.2.jar:/kafka/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-http-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-io-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-security-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-server-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-util-9.2.15.v20160210.jar:/kafka/bin/../libs/jopt-simple-4.9.jar:/kafka/bin/../libs/kafka-clients-0.10.1.0.jar:/kafka/bin/../libs/kafka-log4j-appender-0.10.1.0.jar:/kafka/bin/../libs/kafka-streams-0.10.1.0.jar:/kafka/bin/../libs/kafka-streams-examples-0.10.1.0.jar:/kafka/bin/../libs/kafka-tools-0.10.1.0.jar:/kafka/bin/../libs/kafka_2.11-0.10.1.0-sources.jar:/kafka/bin/../libs/kafka_2.11-0.10.1.0-test-sources.jar:/kafka/bin/../libs/kafka_2.11-0.10.1.0.jar:/kafka/bin/../libs/log4j-1.2.17.jar:/kafka/bin/../libs/lz4-1.3.0.jar:/kafka/bin/../libs/metrics-core-2.2.0.jar:/kafka/bin/../libs/osgi-resource-locator-1.0.1.jar:/kafka/bin/../libs/reflections-0.9.10.jar:/kafka/bin/../libs/rocksdbjni-4.9.0.jar:/kafka/bin/../libs/scala-library-2.11.8.jar:/kafka/bin/../libs/scala-parser-combinators_2.11-1.0.4.jar:/kafka/bin/../libs/slf4j-api-1.7.21.jar:/kafka/bin/../libs/slf4j-log4j12-1.7.21.jar:/kafka/bin/../libs/snappy-java-1.1.2.6.jar:/kafka/bin/../libs/validation-api-1.1.0.Final.jar:/kafka/bin/../libs/zkclient-0.9.jar:/kafka/bin/../libs/zookeeper-3.4.8.jar kafka.Kafka /kafka/config/server.properties
This is running inside a docker container.kafka@86a156fd9dda :~$ java -version java version "1.8.0_60" Java(TM) SE Runtime Environment (build 1.8.0_60-b27) Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode) kafka@86a156fd9dda :~$ uname -a Linux 86a156fd9dda 4.7.3-coreos-r2 #1 SMP Tue Nov 1 01:38:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux kafka@86a156fd9dda :~$ ps alx | grep java 4 1000 1 0 20 0 75887304 3820220 futex_ Ssl ? 9379:49 /usr/lib/jvm/java-8-oracle/bin/java -Xmx3G -Xms3G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC -Djava.awt.headless=true -Xloggc:/kafka/bin/../logs/kafkaServer-gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=10.60.68.122 -Dcom.sun.management.jmxremote.rmi.port=7203 -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7203 -Dkafka.logs.dir=/kafka/bin/../logs -Dlog4j.configuration=file:/kafka/bin/../config/log4j.properties -cp :/kafka/bin/../libs/aopalliance-repackaged-2.4.0-b34.jar:/kafka/bin/../libs/argparse4j-0.5.0.jar:/kafka/bin/../libs/connect-api-0.10.1.0.jar:/kafka/bin/../libs/connect-file-0.10.1.0.jar:/kafka/bin/../libs/connect-json-0.10.1.0.jar:/kafka/bin/../libs/connect-runtime-0.10.1.0.jar:/kafka/bin/../libs/guava-18.0.jar:/kafka/bin/../libs/hk2-api-2.4.0-b34.jar:/kafka/bin/../libs/hk2-locator-2.4.0-b34.jar:/kafka/bin/../libs/hk2-utils-2.4.0-b34.jar:/kafka/bin/../libs/jackson-annotations-2.6.0.jar:/kafka/bin/../libs/jackson-core-2.6.3.jar:/kafka/bin/../libs/jackson-databind-2.6.3.jar:/kafka/bin/../libs/jackson-jaxrs-base-2.6.3.jar:/kafka/bin/../libs/jackson-jaxrs-json-provider-2.6.3.jar:/kafka/bin/../libs/jackson-module-jaxb-annotations-2.6.3.jar:/kafka/bin/../libs/javassist-3.18.2-GA.jar:/kafka/bin/../libs/javax.annotation-api-1.2.jar:/kafka/bin/../libs/javax.inject-1.jar:/kafka/bin/../libs/javax.inject-2.4.0-b34.jar:/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/kafka/bin/../libs/javax.ws.rs-api-2.0.1.jar:/kafka/bin/../libs/jersey-client-2.22.2.jar:/kafka/bin/../libs/jersey-common-2.22.2.jar:/kafka/bin/../libs/jersey-container-servlet-2.22.2.jar:/kafka/bin/../libs/jersey-container-servlet-core-2.22.2.jar:/kafka/bin/../libs/jersey-guava-2.22.2.jar:/kafka/bin/../libs/jersey-media-jaxb-2.22.2.jar:/kafka/bin/../libs/jersey-server-2.22.2.jar:/kafka/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-http-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-io-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-security-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-server-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-util-9.2.15.v20160210.jar:/kafka/bin/../libs/jopt-simple-4.9.jar:/kafka/bin/../libs/kafka-clients-0.10.1.0.jar:/kafka/bin/../libs/kafka-log4j-appender-0.10.1.0.jar:/kafka/bin/../libs/kafka-streams-0.10.1.0.jar:/kafka/bin/../libs/kafka-streams-examples-0.10.1.0.jar:/kafka/bin/../libs/kafka-tools-0.10.1.0.jar:/kafka/bin/../libs/kafka_2.11-0.10.1.0-sources.jar:/kafka/bin/../libs/kafka_2.11-0.10.1.0-test-sources.jar:/kafka/bin/../libs/kafka_2.11-0.10.1.0.jar:/kafka/bin/../libs/log4j-1.2.17.jar:/kafka/bin/../libs/lz4-1.3.0.jar:/kafka/bin/../libs/metrics-core-2.2.0.jar:/kafka/bin/../libs/osgi-resource-locator-1.0.1.jar:/kafka/bin/../libs/reflections-0.9.10.jar:/kafka/bin/../libs/rocksdbjni-4.9.0.jar:/kafka/bin/../libs/scala-library-2.11.8.jar:/kafka/bin/../libs/scala-parser-combinators_2.11-1.0.4.jar:/kafka/bin/../libs/slf4j-api-1.7.21.jar:/kafka/bin/../libs/slf4j-log4j12-1.7.21.jar:/kafka/bin/../libs/snappy-java-1.1.2.6.jar:/kafka/bin/../libs/validation-api-1.1.0.Final.jar:/kafka/bin/../libs/zkclient-0.9.jar:/kafka/bin/../libs/zookeeper-3.4.8.jar kafka.Kafka /kafka/config/server.properties This is running inside a docker container.
Description
My cluster is unable to communicate to one of my brokers (Broker 1 in this case) and is spinning on logs:
[2016-11-29 19:05:08,659] WARN [ReplicaFetcherThread-0-1], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest@27aeb5f4 (kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to 10.60.68.122:9092 (id: 1 rack: null) failed
at kafka.utils.NetworkClientBlockingOps$.awaitReady$1(NetworkClientBlockingOps.scala:83)
at kafka.utils.NetworkClientBlockingOps$.blockingReady$extension(NetworkClientBlockingOps.scala:93)
at kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:248)
at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:238)
at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
and on Broker 1 I tried issuing a shutdown and I get this:
[2016-11-29 18:33:16,152] INFO [Kafka Server 1], shutting down (kafka.server.KafkaServer) [2016-11-29 18:33:16,163] INFO [Kafka Server 1], Starting controlled shutdown (kafka.server.KafkaServer) [2016-11-29 18:33:38,347] INFO [Kafka Server 1], Remaining partitions to move: LONG_LIST_OF_TOPICS [2016-11-29 18:33:38,350] INFO [Kafka Server 1], Error code from controller: 0 (kafka.server.KafkaServer) [2016-11-29 18:33:43,356] WARN [Kafka Server 1], Retrying controlled shutdown after the previous attempt failed... (kafka.server.KafkaServer) [2016-11-29 18:34:04,053] INFO [Kafka Server 1], Remaining partitions to move: SAME_LONG_LIST_OF_TOPICS_AGAIN [2016-11-29 18:34:04,053] INFO [Kafka Server 1], Error code from controller: 0 (kafka.server.KafkaServer) [2016-11-29 18:34:09,054] WARN [Kafka Server 1], Retrying controlled shutdown after the previous attempt failed... (kafka.server.KafkaServer) [2016-11-29 18:34:32,577] INFO [Kafka Server 1], Remaining partitions to move: SAM_LONG_LIST_OF_TOPICS_AGAIN_AGAIN [2016-11-29 18:34:32,578] INFO [Kafka Server 1], Error code from controller: 0 (kafka.server.KafkaServer) [2016-11-29 18:34:37,579] WARN [Kafka Server 1], Retrying controlled shutdown after the previous attempt failed... (kafka.server.KafkaServer) [2016-11-29 18:34:37,586] WARN [Kafka Server 1], Proceeding to do an unclean shutdown as all the controlled shutdown attempts failed (kafka.server.KafkaServer) [2016-11-29 18:34:37,612] INFO [Socket Server on Broker 1], Shutting down (kafka.network.SocketServer) [2016-11-29 18:42:36,940] INFO Rolled new log segment for '__consumer_offsets-30' in 6 ms. (kafka.log.Log) [2016-11-29 18:43:52,440] INFO Deleting segment 71712593 from log __consumer_offsets-30. (kafka.log.Log) [2016-11-29 18:43:52,440] INFO Deleting segment 0 from log __consumer_offsets-30. (kafka.log.Log) [2016-11-29 18:43:52,492] INFO Deleting index /data/__consumer_offsets-30/00000000000071712593.index.deleted (kafka.log.OffsetIndex) [2016-11-29 18:43:52,532] INFO Deleting index /data/__consumer_offsets-30/00000000000000000000.index.deleted (kafka.log.OffsetIndex) [2016-11-29 18:43:52,532] INFO Deleting index /data/__consumer_offsets-30/00000000000000000000.timeindex.deleted (kafka.log.TimeIndex) [2016-11-29 18:43:52,549] INFO Deleting index /data/__consumer_offsets-30/00000000000071712593.timeindex.deleted (kafka.log.TimeIndex) [2016-11-29 18:43:53,370] INFO Deleting segment 72483593 from log __consumer_offsets-30. (kafka.log.Log) [2016-11-29 18:43:53,478] INFO Deleting index /data/__consumer_offsets-30/00000000000072483593.index.deleted (kafka.log.OffsetIndex) [2016-11-29 18:43:53,479] INFO Deleting index /data/__consumer_offsets-30/00000000000072483593.timeindex.deleted (kafka.log.TimeIndex)
so it says it's doing an unclean shutdown but then it refuses to stop the process. now I have this sort of zombie process and the other brokers are spinning even faster on trying to connect to it.
What other logs can I provide to help debug this broker's failure?