Description
I used the generated configuration (in assemble/accumulo-$VERSION-dev/accumulo-$VERSION) and got errors in the master and tserver:
TServer
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51) at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:356) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1608) at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190) at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) ... 10 more
Master
2015-01-19 17:07:55,505 [transport.TSaslTransport] ERROR: SASL negotiation failure javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)] at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212) at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) at org.apache.accumulo.core.rpc.UGIAssumingTransport$1.run(UGIAssumingTransport.java:53) at org.apache.accumulo.core.rpc.UGIAssumingTransport$1.run(UGIAssumingTransport.java:49) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.accumulo.core.rpc.UGIAssumingTransport.open(UGIAssumingTransport.java:49) at org.apache.accumulo.core.rpc.ThriftUtil.createClientTransport(ThriftUtil.java:358) at org.apache.accumulo.core.client.impl.ThriftTransportPool.createNewTransport(ThriftTransportPool.java:478) at org.apache.accumulo.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:411) at org.apache.accumulo.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:389) at org.apache.accumulo.core.rpc.ThriftUtil.getClient(ThriftUtil.java:122) at org.apache.accumulo.server.master.LiveTServerSet$TServerConnection.halt(LiveTServerSet.java:118) at org.apache.accumulo.master.Master.gatherTableInformation(Master.java:1009) at org.apache.accumulo.master.Master.access$600(Master.java:160) at org.apache.accumulo.master.Master$StatusThread.updateStatus(Master.java:911) at org.apache.accumulo.master.Master$StatusThread.run(Master.java:901) Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER) at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:710) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193) ... 19 more Caused by: KrbException: Server not found in Kerberos database (7) - LOOKING_UP_SERVER at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73) at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:192) at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:203) at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:309) at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:115) at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:454) at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:641) ... 22 more
This error occurs due to fact that DNS is so closely tied to the authentication. The default configuration used localhost instead of the FQDN in hosts files (masters, monitors, slaves, tracers, gc). This ultimately created a mismatch between the instance component of the kerberos principal (I used the FQDN) while the thrift server using the FQDN.
We should detect when this happens and throw an intuitive error.