Accumulo / ACCUMULO-3497

Poor error when bind-address of server doesn't match with kerberos principal

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.0
    • Component/s: rpc
    • Labels: None

    Description

      I used the generated configuration (in assemble/accumulo-$VERSION-dev/accumulo-$VERSION) and got errors in the master and tserver:

      TServer
      java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
              at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
              at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
              at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:356)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1608)
              at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
              at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
              at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190)
              at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
              at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
              at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
              at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
              ... 10 more
      
      Master
      2015-01-19 17:07:55,505 [transport.TSaslTransport] ERROR: SASL negotiation failure
      javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)]
              at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
              at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
              at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
              at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
              at org.apache.accumulo.core.rpc.UGIAssumingTransport$1.run(UGIAssumingTransport.java:53)
              at org.apache.accumulo.core.rpc.UGIAssumingTransport$1.run(UGIAssumingTransport.java:49)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
              at org.apache.accumulo.core.rpc.UGIAssumingTransport.open(UGIAssumingTransport.java:49)
              at org.apache.accumulo.core.rpc.ThriftUtil.createClientTransport(ThriftUtil.java:358)
              at org.apache.accumulo.core.client.impl.ThriftTransportPool.createNewTransport(ThriftTransportPool.java:478)
              at org.apache.accumulo.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:411)
              at org.apache.accumulo.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:389)
              at org.apache.accumulo.core.rpc.ThriftUtil.getClient(ThriftUtil.java:122)
              at org.apache.accumulo.server.master.LiveTServerSet$TServerConnection.halt(LiveTServerSet.java:118)
              at org.apache.accumulo.master.Master.gatherTableInformation(Master.java:1009)
              at org.apache.accumulo.master.Master.access$600(Master.java:160)
              at org.apache.accumulo.master.Master$StatusThread.updateStatus(Master.java:911)
              at org.apache.accumulo.master.Master$StatusThread.run(Master.java:901)
      Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)
              at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:710)
              at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
              at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
              at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
              ... 19 more
      Caused by: KrbException: Server not found in Kerberos database (7) - LOOKING_UP_SERVER
              at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
              at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:192)
              at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:203)
              at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:309)
              at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:115)
              at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:454)
              at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:641)
              ... 22 more
      

      This error occurs because DNS is so closely tied to authentication. The default configuration used localhost instead of the FQDN in the hosts files (masters, monitors, slaves, tracers, gc). This ultimately created a mismatch between the instance component of the Kerberos principal (I used the FQDN) and the address the Thrift server was using (localhost).
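To make the mismatch concrete, here is a minimal sketch of how the name a client connects to ends up in the service principal it asks the KDC for. It assumes the usual GSSAPI convention that the requested principal is `primary/host@REALM`, with `host` taken from the address the client dials. The class name, the `accumulo` primary, the hostnames, and the `EXAMPLE.COM` realm are all hypothetical and are not taken from the Accumulo code base.

```java
import java.util.Arrays;

final class RequestedPrincipalSketch {

  // Builds the service principal a client would request a ticket for,
  // assuming the host part comes from the address it connects to.
  static String requestedPrincipal(String primary, String host, String realm) {
    return primary + "/" + host + "@" + realm;
  }

  public static void main(String[] args) {
    // Hypothetical principal that was actually created in the KDC (FQDN instance).
    String registered = "accumulo/tserver1.example.com@EXAMPLE.COM";

    // First entry mimics the generated hosts files, second one uses the FQDN.
    for (String host : Arrays.asList("localhost", "tserver1.example.com")) {
      String requested = requestedPrincipal("accumulo", host, "EXAMPLE.COM");
      String verdict = requested.equals(registered) ? "matches the KDC entry" : "not in the KDC";
      System.out.println(host + " -> " + requested + " (" + verdict + ")");
    }
  }
}
```

With `localhost` in the hosts files, the requested principal has no counterpart in the KDC, which is consistent with the "Server not found in Kerberos database" failure in the master log.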

      We should detect when this happens and throw an intuitive error.
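One way such a check could look is sketched below. This is not the fix that was committed for this issue; it only illustrates the idea of comparing the instance component of the configured principal with the host the server binds to at startup, and failing with a descriptive message when they differ. `PrincipalBindCheck`, its methods, and the example principal and hostnames are hypothetical.

```java
final class PrincipalBindCheck {

  // Returns the instance (host) component of "primary/instance@REALM",
  // or null when the principal has no instance component.
  static String instanceOf(String principal) {
    int slash = principal.indexOf('/');
    if (slash < 0) {
      return null;
    }
    int at = principal.indexOf('@', slash);
    return at < 0 ? principal.substring(slash + 1) : principal.substring(slash + 1, at);
  }

  // Throws a descriptive exception when the principal's host part does not
  // match the host the server is about to bind to.
  static void check(String principal, String bindHost) {
    String instance = instanceOf(principal);
    if (instance == null) {
      throw new IllegalArgumentException(
          "Kerberos principal '" + principal + "' has no instance (host) component");
    }
    if (!instance.equalsIgnoreCase(bindHost)) {
      throw new IllegalStateException(
          "Kerberos principal '" + principal + "' names host '" + instance
              + "' but the server is binding to '" + bindHost
              + "'; check the hosts files and the configured principal");
    }
  }

  public static void main(String[] args) {
    // Hypothetical values: FQDN in the principal, localhost from the hosts file.
    String principal = "accumulo/tserver1.example.com@EXAMPLE.COM";
    check(principal, "tserver1.example.com"); // passes silently
    try {
      check(principal, "localhost");
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Running the check before the Thrift server starts would surface the misconfiguration on the server that is misconfigured, instead of as a GSS failure on a remote peer.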

          People

            Assignee: Josh Elser (elserj)
            Reporter: Josh Elser (elserj)

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 20m