Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Cannot Reproduce
Affects Version/s: 0.9.0.1
Fix Version/s: None
Component/s: None
Environment: Intel x86_64, Ubuntu 14.04
Description
Hi,
I'm trying to run the test_console_consumer.py system test and it's failing while testing the SASL protocols.
[INFO - 2016-03-15 14:41:58,533 - runner - log - lineno:211]: SerialTestRunner: kafkatest.sanity_checks.test_console_consumer.ConsoleConsumerTest.test_lifecycle.security_protocol=SASL_SSL: Summary: Kafka server didn't finish startup
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/ducktape/tests/runner.py", line 102, in run_all_tests
    result.data = self.run_single_test()
  File "/usr/local/lib/python2.7/dist-packages/ducktape/tests/runner.py", line 154, in run_single_test
    return self.current_test_context.function(self.current_test)
  File "/usr/local/lib/python2.7/dist-packages/ducktape/mark/_mark.py", line 331, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/home/mickael/ibm/messagehub/kafka/tests/kafkatest/sanity_checks/test_console_consumer.py", line 54, in test_lifecycle
    self.kafka.start()
  File "/home/mickael/ibm/messagehub/kafka/tests/kafkatest/services/kafka/kafka.py", line 81, in start
    Service.start(self)
  File "/usr/local/lib/python2.7/dist-packages/ducktape/services/service.py", line 140, in start
    self.start_node(node)
  File "/home/mickael/ibm/messagehub/kafka/tests/kafkatest/services/kafka/kafka.py", line 124, in start_node
    monitor.wait_until("Kafka Server.*started", timeout_sec=30, err_msg="Kafka server didn't finish startup")
  File "/usr/local/lib/python2.7/dist-packages/ducktape/cluster/remoteaccount.py", line 303, in wait_until
    return wait_until(lambda: self.acct.ssh("tail -c +%d %s | grep '%s'" % (self.offset+1, self.log, pattern), allow_fail=True) == 0, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/ducktape/utils/util.py", line 36, in wait_until
    raise TimeoutError(err_msg)
TimeoutError: Kafka server didn't finish startup
Looking at the logs from the Kafka worker, I can see that Kafka is not able to connect to the Kerberos server:
[2016-03-15 14:41:28,751] FATAL [Kafka Server 1], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.KafkaException: javax.security.auth.login.LoginException: Connection refused
    at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:74)
    at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:60)
    at kafka.network.Processor.<init>(SocketServer.scala:379)
    at kafka.network.SocketServer$$anonfun$startup$1$$anonfun$apply$1.apply$mcVI$sp(SocketServer.scala:96)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
    at kafka.network.SocketServer$$anonfun$startup$1.apply(SocketServer.scala:95)
    at kafka.network.SocketServer$$anonfun$startup$1.apply(SocketServer.scala:91)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.MapLike$DefaultValuesIterable.foreach(MapLike.scala:206)
    at kafka.network.SocketServer.startup(SocketServer.scala:91)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:179)
    at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
    at kafka.Kafka$.main(Kafka.scala:67)
    at kafka.Kafka.main(Kafka.scala)
Looking at the Kerberos worker, I can see that it started fine:
Standalone MiniKdc Running
---------------------------------------------------
Realm : EXAMPLE.COM
Running at : worker4:worker4
krb5conf : /mnt/minikdc/krb5.conf
created keytab : /mnt/minikdc/keytab
with principals : [client, kafka/worker2]
Do <CTRL-C> or kill <PID> to stop it
---------------------------------------------------
Running netstat on the Kerberos worker, I can see that it's listening on port 47385:
vagrant@worker4:~$ netstat -ano
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       Timer
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      off (0.00/0/0)
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      off (0.00/0/0)
tcp        0      0 0.0.0.0:44313           0.0.0.0:*               LISTEN      off (0.00/0/0)
tcp        0      0 10.0.2.15:22            10.0.2.2:56153          ESTABLISHED keepalive (7165.86/0/0)
tcp        0      0 127.0.0.1:47747         127.0.1.1:47385         TIME_WAIT   timewait (30.08/0/0)
tcp6       0      0 :::111                  :::*                    LISTEN      off (0.00/0/0)
tcp6       0      0 :::22                   :::*                    LISTEN      off (0.00/0/0)
tcp6       0      0 127.0.1.1:47385         :::*                    LISTEN      off (0.00/0/0)
udp6       0      0 :::45368                :::*                                off (0.00/0/0)
From the same worker, I can connect fine to 47385 (the Kerberos port):
vagrant@worker4:~$ nc -vvv worker4 47385
Connection to worker4 47385 port [tcp/*] succeeded!
But the same connection fails from every other worker.
It seems strange that Kerberos is listening on the loopback address (127.0.1.1, as shown in the netstat output) and not on the public one.
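To narrow this down, the check can be scripted and run on each worker. The following is only a rough sketch in Python 3 (not part of kafkatest or ducktape; the helper names `is_loopback`, `can_connect` and `describe_endpoint` are made up for illustration). It resolves the host name as seen from the machine it runs on, reports whether that address is a loopback address, and tries a TCP connection to the port.

```python
import ipaddress
import socket


def is_loopback(address):
    """Return True if the IP address string is a loopback address (127.0.0.0/8 or ::1)."""
    return ipaddress.ip_address(address).is_loopback


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


def describe_endpoint(host, port):
    """Report what host resolves to on this machine and whether the port is reachable."""
    try:
        resolved = socket.gethostbyname(host)  # IPv4 only, resolved locally
    except socket.gaierror:
        resolved = None
    return {
        "host": host,
        "resolved": resolved,
        "loopback": is_loopback(resolved) if resolved else None,
        "reachable": can_connect(host, port),
    }


# Example call, using the host and port from this report:
# print(describe_endpoint("worker4", 47385))
```

If `loopback` comes back true on worker4 itself while the other workers resolve worker4 to a different address and report `reachable` as false, that would be consistent with the KDC being bound only to the address that the local host name maps to.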