Details
Type: Bug
Status: Done
Priority: Major
Resolution: Done
Description
In a 12-node deployment, some topologies fail to start after kerberization, throwing exceptions like the one pasted below. The node does not have valid Kerberos credentials because kinit has not been run on that host.
2017-09-11 16:04:46.923 o.a.h.i.Client [WARN] Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2017-09-11 16:04:46.930 o.a.m.p.GrokParser [ERROR] Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "metron-2/xx.xx.xx.xx"; destination host is: "metron-2.openstacklocal":8020;
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "nat-r7-lqys-metron-2/172.22.104.43"; destination host is: "nat-r7-lqys-metron-2.openstacklocal":8020;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1480) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1407) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) ~[stormjar.jar:?]
    at com.sun.proxy.$Proxy45.getFileInfo(Unknown Source) ~[?:?]
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771) ~[stormjar.jar:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_141]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_141]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_141]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141]
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[stormjar.jar:?]
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[stormjar.jar:?]
    at com.sun.proxy.$Proxy46.getFileInfo(Unknown Source) ~[?:?]
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116) ~[stormjar.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305) ~[stormjar.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[stormjar.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424) ~[stormjar.jar:?]
    at org.apache.metron.parsers.GrokParser.openInputStream(GrokParser.java:83) ~[stormjar.jar:?]
    at org.apache.metron.parsers.GrokParser.init(GrokParser.java:94) [stormjar.jar:?]
    at org.apache.metron.parsers.bolt.ParserBolt.prepare(ParserBolt.java:108) [stormjar.jar:?]
    at org.apache.storm.daemon.executor$fn__6573$fn__6586.invoke(executor.clj:798) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
    at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:482) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682) ~[stormjar.jar:?]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_141]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_141]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:732) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1446) ~[stormjar.jar:?]
    ... 24 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) ~[?:1.8.0_141]
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:414) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:724) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:720) ~[stormjar.jar:?]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_141]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_141]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1446) ~[stormjar.jar:?]
    ... 24 more
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147) ~[?:1.8.0_141]
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122) ~[?:1.8.0_141]
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187) ~[?:1.8.0_141]
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224) ~[?:1.8.0_141]
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212) ~[?:1.8.0_141]
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) ~[?:1.8.0_141]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192) ~[?:1.8.0_141]
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:414) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:724) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:720) ~[stormjar.jar:?]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_141]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_141]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1446) ~[stormjar.jar:?]
    ... 24 more
2017-09-11 16:04:46.935 o.a.s.util [ERROR] Async loop died!
java.lang.RuntimeException: Grok parser Error: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "metron-2/xx.xx.xx.xx"; destination host is: "metron-2.openstacklocal":8020;
    at org.apache.metron.parsers.GrokParser.init(GrokParser.java:123) ~[stormjar.jar:?]
    at org.apache.metron.parsers.bolt.ParserBolt.prepare(ParserBolt.java:108) ~[stormjar.jar:?]
    at org.apache.storm.daemon.executor$fn__6573$fn__6586.invoke(executor.clj:798) ~[storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
    at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:482) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "metron-2/xx.xx.xx.xx"; destination host is: "metron-2.openstacklocal":8020;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1480) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1407) ~[stormjar.jar:?]
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) ~[stormjar.jar:?]
    at com.sun.proxy.$Proxy45.getFileInfo(Unknown Source) ~[?:?]
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771) ~[stormjar.jar:?]
<snip>
Eventually, the topology dies with the following exception:
2017-09-11 16:04:47.080 o.a.s.util [ERROR] Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
    at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
    at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
    at org.apache.storm.daemon.worker$fn__7178$fn__7179.invoke(worker.clj:765) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
    at org.apache.storm.daemon.executor$mk_executor_data$fn__6390$fn__6391.invoke(executor.clj:275) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
    at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:494) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
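On a multi-node cluster it can help to find out which hosts are reporting the missing TGT. The following is a minimal sketch, not part of Metron: it scans worker log text for the "Failed to find any Kerberos tgt" message and collects the host name from the "local host is:" field. The function name and the regular expression are illustrative assumptions based only on the log lines quoted in this issue.

```python
import re

# Marker string copied from the log lines quoted in this issue.
TGT_MARKER = "Failed to find any Kerberos tgt"

# Matches: local host is: "name/ip"  (tolerates stray spaces around "is" and ":").
LOCAL_HOST_RE = re.compile(r'local\s*host\s*is\s*:\s*"([^"/]+)/')


def hosts_missing_tgt(log_text):
    """Return sorted host names found on log lines that report a missing TGT."""
    hosts = set()
    for line in log_text.splitlines():
        if TGT_MARKER in line:
            hosts.update(LOCAL_HOST_RE.findall(line))
    return sorted(hosts)


# Shortened sample line in the same shape as the GrokParser error above.
sample = (
    '2017-09-11 16:04:46.930 o.a.m.p.GrokParser [ERROR] Failed on local exception: '
    'java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed '
    '[Caused by GSSException: No valid credentials provided '
    '(Mechanism level: Failed to find any Kerberos tgt)]; '
    'Host Details : local host is: "metron-2/xx.xx.xx.xx"; '
    'destination host is: "metron-2.openstacklocal":8020;'
)

print(hosts_missing_tgt(sample))  # ['metron-2']
```

Running this over the worker logs collected from each node would give a list of hosts where kinit still needs to happen.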
Discussed this behavior with nallen while testing for BUG-85655; creating this as a separate issue.
Besides running kinit, we also need to set up authorization for Kafka, Storm, and HBase by following the steps here:
- https://github.com/apache/metron/blob/master/metron-deployment/Kerberos-manual-setup.md#kafka-authorization
- https://github.com/apache/metron/blob/master/metron-deployment/Kerberos-manual-setup.md#hbase-authorization
- https://github.com/apache/metron/blob/master/metron-deployment/Kerberos-manual-setup.md#storm-authorization
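Before starting topologies on a node, a quick check that a ticket exists can catch the missing kinit early. This is a rough sketch under assumptions: it relies on MIT Kerberos `klist -s`, which prints nothing and exits non-zero when the credentials cache cannot be read or has expired, and it only checks the cache of the user running the script (which user the Storm workers run as is deployment-specific and not stated in this issue). The `has_valid_tgt` helper is a hypothetical name.

```python
import subprocess


def has_valid_tgt(run=subprocess.run):
    """Return True if `klist -s` exits 0 for the current user.

    `run` is injectable so the check can be exercised without a Kerberos client.
    """
    try:
        result = run(["klist", "-s"])
    except FileNotFoundError:
        # klist is not installed on this host, so no ticket can be confirmed.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if has_valid_tgt():
        print("Valid TGT found")
    else:
        print("No valid TGT: run kinit on this host first")
```

This only covers the kinit part; the Kafka, HBase, and Storm authorization steps linked above still have to be done separately.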