  1. Metron (Retired)
  2. METRON-1178

kinit + authorize on all nodes running topologies for multi-node metron deployments


Details

    • Bug
    • Status: Done
    • Major
    • Resolution: Done

    Description

      In a 12-node deployment, some topologies fail to start after Kerberization, throwing exceptions like the one pasted below. The node running the topology does not have valid Kerberos credentials because kinit has not been run on that host.

      2017-09-11 16:04:46.923 o.a.h.i.Client [WARN] Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
      2017-09-11 16:04:46.930 o.a.m.p.GrokParser [ERROR] Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "metron-2/xx.xx.xx.xx"; destination host is: "metron-2.openstacklocal":8020;
      java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "nat-r7-lqys-metron-2/172.22.104.43"; destination host is: "nat-r7-lqys-metron-2.openstacklocal":8020;
      	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1480) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1407) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) ~[stormjar.jar:?]
      	at com.sun.proxy.$Proxy45.getFileInfo(Unknown Source) ~[?:?]
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771) ~[stormjar.jar:?]
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_141]
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_141]
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_141]
      	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141]
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[stormjar.jar:?]
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[stormjar.jar:?]
      	at com.sun.proxy.$Proxy46.getFileInfo(Unknown Source) ~[?:?]
      	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116) ~[stormjar.jar:?]
      	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305) ~[stormjar.jar:?]
      	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301) ~[stormjar.jar:?]
      	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[stormjar.jar:?]
      	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317) ~[stormjar.jar:?]
      	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424) ~[stormjar.jar:?]
      	at org.apache.metron.parsers.GrokParser.openInputStream(GrokParser.java:83) ~[stormjar.jar:?]
      	at org.apache.metron.parsers.GrokParser.init(GrokParser.java:94) [stormjar.jar:?]
      	at org.apache.metron.parsers.bolt.ParserBolt.prepare(ParserBolt.java:108) [stormjar.jar:?]
      	at org.apache.storm.daemon.executor$fn__6573$fn__6586.invoke(executor.clj:798) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
      	at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:482) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
      	at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
      	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
      Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
      	at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682) ~[stormjar.jar:?]
      	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_141]
      	at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_141]
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:732) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1446) ~[stormjar.jar:?]
      	... 24 more
      Caused by: javax.security.sasl.SaslException: GSS initiate failed
      	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) ~[?:1.8.0_141]
      	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:414) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:724) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:720) ~[stormjar.jar:?]
      	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_141]
      	at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_141]
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1446) ~[stormjar.jar:?]
      	... 24 more
      Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
      	at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147) ~[?:1.8.0_141]
      	at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122) ~[?:1.8.0_141]
      	at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187) ~[?:1.8.0_141]
      	at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224) ~[?:1.8.0_141]
      	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212) ~[?:1.8.0_141]
      	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) ~[?:1.8.0_141]
      	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192) ~[?:1.8.0_141]
      	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:414) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:724) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:720) ~[stormjar.jar:?]
      	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_141]
      	at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_141]
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1446) ~[stormjar.jar:?]
      	... 24 more
      2017-09-11 16:04:46.935 o.a.s.util [ERROR] Async loop died!
      java.lang.RuntimeException: Grok parser Error: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "metron-2/xx.xx.xx.xx"; destination host is: "metron-2.openstacklocal":8020;
      	at org.apache.metron.parsers.GrokParser.init(GrokParser.java:123) ~[stormjar.jar:?]
      	at org.apache.metron.parsers.bolt.ParserBolt.prepare(ParserBolt.java:108) ~[stormjar.jar:?]
      	at org.apache.storm.daemon.executor$fn__6573$fn__6586.invoke(executor.clj:798) ~[storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
      	at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:482) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
      	at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
      	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
      Caused by: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "metron-2/xx.xx.xx.xx"; destination host is: "metron-2.openstacklocal":8020;
      	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1480) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1407) ~[stormjar.jar:?]
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) ~[stormjar.jar:?]
      	at com.sun.proxy.$Proxy45.getFileInfo(Unknown Source) ~[?:?]
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771) ~[stormjar.jar:?]
      <snip>
      

      Eventually, the topology worker dies with the following exception:

      2017-09-11 16:04:47.080 o.a.s.util [ERROR] Halting process: ("Worker died")
      java.lang.RuntimeException: ("Worker died")
      	at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
      	at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
      	at org.apache.storm.daemon.worker$fn__7178$fn__7179.invoke(worker.clj:765) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
      	at org.apache.storm.daemon.executor$mk_executor_data$fn__6390$fn__6391.invoke(executor.clj:275) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
      	at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:494) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
      	at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
      	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
      
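
In a deployment with many nodes, it can help to find out which hosts are affected before fixing them. The following is a minimal sketch, not part of the original report, that scans worker log text for the "Failed to find any Kerberos tgt" message and collects the local host name reported on the same line. The function name `hosts_missing_tgt` is hypothetical, and the sketch assumes unwrapped log lines in the format shown above.

```python
import re

# Message logged when the JVM has no Kerberos ticket-granting ticket.
TGT_MARKER = "Failed to find any Kerberos tgt"
# Matches the 'local host is: "name/address"' fragment of Hadoop IPC errors.
LOCAL_HOST_RE = re.compile(r'local host is: "([^"/]+)/')


def hosts_missing_tgt(log_text):
    """Return sorted local host names from lines that report a missing TGT.

    Hypothetical helper: lines without a 'local host is' fragment are skipped.
    """
    hosts = set()
    for line in log_text.splitlines():
        if TGT_MARKER in line:
            match = LOCAL_HOST_RE.search(line)
            if match:
                hosts.add(match.group(1))
    return sorted(hosts)
```

Feeding it the contents of each worker log would give a list of hosts where kinit likely still needs to be run.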

      Discussed this behavior with nallen while testing BUG-85655; filing it as a separate issue.
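
Running kinit on every node that hosts topology workers could be scripted rather than done by hand. The sketch below only builds the per-host commands; it is an illustration under assumptions, not the fix applied for this issue. The host names, the `metron` user, the keytab path, and the principal are all hypothetical placeholders, and `kinit_commands` is a made-up helper name. The `-kt` flags are the standard MIT `kinit` options for authenticating from a keytab.

```python
import shlex


def kinit_commands(hosts, keytab, principal, run_as="storm"):
    """Build one ssh command per host that obtains a TGT from a keytab.

    All arguments are deployment-specific assumptions; nothing is executed here.
    """
    remote = "sudo -u {} kinit -kt {} {}".format(
        shlex.quote(run_as), shlex.quote(keytab), shlex.quote(principal)
    )
    # Each command is an argv list suitable for subprocess.run(cmd, check=True).
    return [["ssh", host, remote] for host in hosts]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    for cmd in kinit_commands(
        ["node1", "node2"],
        "/etc/security/keytabs/metron.headless.keytab",
        "metron@EXAMPLE.COM",
        run_as="metron",
    ):
        print(" ".join(cmd))
```

Keeping command construction separate from execution makes it easy to review the exact commands before running them against a cluster.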

      Besides running kinit, each node also needs to be authorized for Kafka, Storm, and HBase by following the steps here:

          People

            Assignee: Unassigned
            Reporter: Anand Subramanian (anandsubbu)