  1. Ambari (Retired)
  2. AMBARI-25672

Deleting a host from a Kerberos cluster does not completely clear all component Kerberos identities from the database and KDC


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.3
    • Fix Version/s: None
    • Component/s: ambari-server
    • Labels: None

    Description

      Step 1:

      1. Choose a host to delete from a Kerberos-enabled cluster (not a master host).
      2. Stop all services on the host.
      3. Delete the host through the API.
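The report does not show the request used in step 1.3. As a minimal sketch, assuming the cluster-scoped host endpoint of the Ambari REST API and HTTP basic authentication, the request could be built as follows. The server URL, cluster name, host name and credentials are placeholders, and the class name is hypothetical. The sketch only builds the request; it does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class DeleteHostRequestSketch {

    /** Builds (but does not send) the DELETE request for one host in one cluster. */
    static HttpRequest build(String baseUrl, String cluster, String host,
                             String user, String password) {
        String credentials = user + ":" + password;
        String basicAuth = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder()
                // Assumed endpoint shape: /api/v1/clusters/{cluster}/hosts/{host}
                .uri(URI.create(baseUrl + "/api/v1/clusters/" + cluster + "/hosts/" + host))
                .header("Authorization", "Basic " + basicAuth)
                // Ambari expects this header on modifying requests.
                .header("X-Requested-By", "ambari")
                .DELETE()
                .build();
    }

    public static void main(String[] args) {
        // All values below are placeholders, not taken from the report.
        HttpRequest request = build("http://ambari.example.com:8080", "mycluster",
                "worker3.example.com", "admin", "admin");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would need an `HttpClient`; it is left out so the sketch runs without a live server.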

      Step 2:

      1. Prepare a host and install the agent.
      2. Add the node to the cluster through the API and install services.
      3. Run regenerate_keytab.
      4. Ambari hangs at preparing operations/hostname/preparing operations.

      This happens because step 1.3 does not completely clear this host's component Kerberos identities from either the database (MySQL) or the KDC (kdc.admin).

      • In MySQL:

        There are four tables: kkp_mapping_service, kerberos_keytab_principal, kerberos_keytab, and kerberos_principal. The host's Kerberos identities must be deleted completely from all of them.

      • In the KDC:
        kadmin.local
        listprincs *hostname*

        This still lists identities for the host that were not deleted.

      The Kerberos identities of some services are deleted from MySQL and the KDC, but those of other services are not.

      If the Kerberos identities of any service are left behind, the next attempt to add a host to this cluster hangs at preparing operations.
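To check for leftovers after a host is deleted, the principal names printed by listprincs can be filtered by host. The following is a minimal sketch, assuming principals of the usual primary/instance@REALM form where the instance is the host name. The class name and the sample principals are hypothetical and not taken from the report.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class LeftoverPrincipals {

    /**
     * Returns the principals whose instance part (the text between '/' and '@')
     * equals the given host name, ignoring case.
     */
    static List<String> forHost(List<String> principals, String hostName) {
        String wanted = hostName.toLowerCase(Locale.ROOT);
        return principals.stream()
                .map(String::trim)
                .filter(p -> instanceOf(p).equals(wanted))
                .collect(Collectors.toList());
    }

    /** Extracts the lower-cased instance part of "primary/instance@REALM", or "" if absent. */
    private static String instanceOf(String principal) {
        int slash = principal.indexOf('/');
        if (slash < 0) {
            return "";
        }
        int at = principal.indexOf('@', slash);
        String instance = (at < 0) ? principal.substring(slash + 1)
                                   : principal.substring(slash + 1, at);
        return instance.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Illustrative listprincs output; every name here is made up.
        List<String> listed = List.of(
                "dn/worker3.example.com@EXAMPLE.COM",
                "nm/worker3.example.com@EXAMPLE.COM",
                "nn/master1.example.com@EXAMPLE.COM",
                "hdfs-mycluster@EXAMPLE.COM");
        // prints [dn/worker3.example.com@EXAMPLE.COM, nm/worker3.example.com@EXAMPLE.COM]
        System.out.println(forHost(listed, "worker3.example.com"));
    }
}
```

An empty result for the deleted host would suggest that the KDC side was cleaned up; the four MySQL tables still need to be checked separately.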

       

      Call chain of the delete-host API in ambari-server:

      org.apache.ambari.server.api.services.HostService#deleteHost
      org.apache.ambari.server.api.services.BaseService#handleRequest
      org.apache.ambari.server.api.services.BaseRequest#process
      org.apache.ambari.server.api.handlers.BaseManagementHandler#handleRequest
      org.apache.ambari.server.api.handlers.DeleteHandler#persist
      org.apache.ambari.server.api.services.persistence.PersistenceManagerImpl#delete
      org.apache.ambari.server.controller.internal.ClusterControllerImpl#deleteResources
      org.apache.ambari.server.controller.internal.AbstractAuthorizedResourceProvider#deleteResources
      org.apache.ambari.server.controller.internal.HostResourceProvider#deleteResourcesAuthorized
      org.apache.ambari.server.controller.internal.HostResourceProvider#deleteHosts
      A=org.apache.ambari.server.controller.internal.HostResourceProvider#processDeleteHostRequests
      

       A=org.apache.ambari.server.controller.internal.HostResourceProvider#processDeleteHostRequests has two main steps:

      A1=org.apache.ambari.server.controller.AmbariManagementControllerImpl#deleteHostComponents  // deletes the components and their Kerberos identities
      A2=org.apache.ambari.server.state.cluster.ClustersImpl#deleteHost // deletes the host from MySQL

       

       A1=org.apache.ambari.server.controller.AmbariManagementControllerImpl#deleteHostComponents call chain

      org.apache.ambari.server.state.ServiceComponentImpl#deleteServiceComponentHosts
      A1-1=org.apache.ambari.server.state.svccomphost.ServiceComponentHostImpl#delete
      

      A1-1=org.apache.ambari.server.state.svccomphost.ServiceComponentHostImpl#delete call chain

      org.apache.ambari.server.state.cluster.ClusterImpl#removeServiceComponentHost
      A1-1-1=eventPublisher.publish(event);  // publishes a ServiceComponentUninstalledEvent, which org.apache.ambari.server.controller.utilities.KerberosIdentityCleaner#componentRemoved handles by deleting the component's Kerberos identities. Once the event is published, the next line of code runs immediately; it does not wait for the handler to finish.
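The behaviour noted at A1-1-1, where publish returns before the handler has finished, can be illustrated with a small self-contained sketch. It does not use Ambari's event publisher: a single-thread executor stands in for it, and the class and step names are hypothetical. The latch is an artificial device that forces the interleaving in which the host row is deleted before the cleanup handler runs; in a real server either order can occur, which is the race this report describes.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class AsyncPublishSketch {

    /** Runs the scenario once and returns the order in which the two steps completed. */
    static List<String> run() {
        List<String> order = new CopyOnWriteArrayList<>();
        ExecutorService eventBus = Executors.newSingleThreadExecutor();
        CountDownLatch hostDeleted = new CountDownLatch(1);

        // "publish": hand the event to another thread and return immediately.
        eventBus.submit(() -> {
            try {
                // Artificial: hold the handler back until the caller has moved on.
                hostDeleted.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            order.add("identity cleanup handled");
        });

        // The caller continues without waiting for the handler (stand-in for A2).
        order.add("host row deleted");
        hostDeleted.countDown();

        eventBus.shutdown();
        try {
            eventBus.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }

    public static void main(String[] args) {
        // prints [host row deleted, identity cleanup handled]
        System.out.println(run());
    }
}
```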
      

      A1-1-1=org.apache.ambari.server.controller.utilities.KerberosIdentityCleaner#componentRemoved call chain

      org.apache.ambari.server.controller.utilities.RemovableIdentities#remove
      org.apache.ambari.server.controller.KerberosHelperImpl#deleteIdentities(org.apache.ambari.server.state.Cluster, java.util.List<org.apache.ambari.server.serveraction.kerberos.Component>, java.util.Set<java.lang.String>)
      org.apache.ambari.server.controller.KerberosHelperImpl#validateKDCCredentials(org.apache.ambari.server.controller.KerberosDetails, org.apache.ambari.server.state.Cluster) // checks the KDC administrator credentials
      A1-1-1-1=org.apache.ambari.server.controller.DeleteIdentityHandler#addDeleteIdentityStages // adds the stages used to delete the identities
      

      A1-1-1-1=org.apache.ambari.server.controller.DeleteIdentityHandler#addDeleteIdentityStages call chain

      if (manageIdentities) {
        addPrepareDeleteIdentity(cluster, hostParamsJson, event, commandParameters, stageContainer);
        addDeleteKeytab(cluster, commandParameters.getAffectedHostNames(), hostParamsJson, commandParameters, stageContainer);
        addDestroyPrincipals(cluster, hostParamsJson, event, commandParameters, stageContainer);
      }
      org.apache.ambari.server.controller.DeleteIdentityHandler#addDeleteKeytab // checks whether the host exists to decide whether to create this stage. For the component's Kerberos identities to be deleted, this stage should not be created; that is, the host-exists check should be false, because A2 has already deleted the host from MySQL.
      org.apache.ambari.server.controller.DeleteIdentityHandler#addDestroyPrincipals
      org.apache.ambari.server.serveraction.kerberos.DestroyPrincipalsServerAction#execute // deletes the component's Kerberos identities from both MySQL and the KDC. It calls kerberosKeytabPrincipalEntities = kerberosKeytabPrincipalDAO.findByFilters(filters); and deletes the returned entities. For the identities to be deleted, that list must not be empty; that is, org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#findByFilter must not return an empty result.
      A-1-1-1-1-1=org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#findByFilter
      

      A-1-1-1-1-1=org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#findByFilter call chain

      for (String hostname : filter.getHostNames()) {
      HostEntity host = hostDAO.findByName(hostname); // normally the deleted host is not found (host == null, so hasNull = true). If the filter contains only this host and the host has been re-inserted, it is found, but its new host ID has no identities in the MySQL kkp tables.
      Predicate hostIDPredicate = (hostIds.isEmpty()) ? null : root.get("hostId").in(hostIds);
      Predicate hostNullIDPredicate = (hasNull) ? root.get("hostId").isNull() : null;
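The effect of a re-inserted host on this lookup can be shown with a simplified model. This is only a sketch under stated assumptions: an in-memory map stands in for the hosts table, the class and field names are hypothetical, and the host IDs 3 and 7 are made up. It mirrors only the hostIds and hasNull bookkeeping from the excerpt above, not the JPA query itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class HostFilterResolutionSketch {

    /** Outcome of resolving host names: the IDs found, and whether any name was missing. */
    static final class Resolved {
        final List<Long> hostIds = new ArrayList<>();
        boolean hasNull;

        @Override
        public String toString() {
            return "hostIds=" + hostIds + " hasNull=" + hasNull;
        }
    }

    /** Resolves each host name against an in-memory stand-in for the hosts table. */
    static Resolved resolve(List<String> hostNames, Map<String, Long> hostsTable) {
        Resolved result = new Resolved();
        for (String name : hostNames) {
            Long id = hostsTable.get(name);
            if (id == null) {
                result.hasNull = true;      // host row is gone
            } else {
                result.hostIds.add(id);     // host row exists, possibly re-inserted
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> hosts = new HashMap<>();
        hosts.put("worker3.example.com", 3L);
        List<String> filter = List.of("worker3.example.com");

        hosts.remove("worker3.example.com");           // stand-in for A2
        System.out.println(resolve(filter, hosts));    // prints hostIds=[] hasNull=true

        hosts.put("worker3.example.com", 7L);          // heartbeat re-inserts the host
        System.out.println(resolve(filter, hosts));    // prints hostIds=[7] hasNull=false
    }
}
```

In the second case the query is restricted to host ID 7, and according to the report no rows in the kkp tables carry that ID, so the result is empty and nothing is deleted.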
      

      A2=org.apache.ambari.server.state.cluster.ClustersImpl#deleteHost call chain

      org.apache.ambari.server.state.cluster.ClustersImpl#deleteHostEntityRelationships
      org.apache.ambari.server.state.cluster.ClustersImpl#unmapHostFromClusters 
      org.apache.ambari.server.state.cluster.ClustersImpl#unmapHostClusterEntities // deletes the host-cluster mapping
      org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#removeByHost
      hostDAO.remove(entity); // Note, if the host is still heartbeating, then new records will be re-inserted into the hosts and hoststate tables
      

      There are four reasons why the Kerberos identities of some services are not deleted:

      • First, kdc.admin.credential is missing, possibly because ambari-server was restarted.

      Solution: make sure kdc.admin.credential exists when the host is deleted; if it does not, add it with a POST request.

      • Second, A1-1-1-1 executes before A2. addDeleteKeytab checks whether the host exists, and because A2 has not executed yet, it does. The stage is therefore added, but executing it always fails, so this ServiceComponentUninstalledEvent fails and the component in the event leaves its Kerberos identity in MySQL and the KDC.

      Solution: repeat the check in addDeleteKeytab to wait for A2 to finish. Most of the time A2 finishes before A1-1-1-1, so the wait is no more than 1 or 2 seconds.

      • Third, A2 executes, but the host is still heartbeating and is re-inserted into hosts. A1-1-1-1 then executes, enters the addDeleteKeytab stage, and fails.

      Solution: in addDeleteKeytab, combine the host-exists check with a check that the host belongs to a cluster, so that a re-inserted host is not treated as existing. A re-inserted host is not mapped to any cluster.

      • Fourth, A1-1-1-1 filters kerberosKeytabPrincipalEntities (kkpes) using A-1-1-1-1-1, which finds the re-inserted host, so kkpes is empty and this ServiceComponentUninstalledEvent leaves the component's Kerberos identities in MySQL and the KDC.

      Solution: in A-1-1-1-1-1, combine the host-exists check with a host-in-cluster check to exclude a re-inserted host when findByFilter is given only one host. (If more than one host is passed to this method, the error does not occur.)
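The second, third and fourth solutions come down to two small building blocks: a bounded re-check, and a host check that also requires a cluster mapping. The following is a minimal sketch of both, not a patch against Ambari. The class and method names are hypothetical, and plain collections stand in for the hosts table and the host-to-cluster mapping.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.BooleanSupplier;

final class ReinsertedHostGuardSketch {

    /**
     * A host counts as managed only if it has a host row AND is mapped to at least
     * one cluster. A host re-inserted by a heartbeat has a row but no mapping.
     */
    static boolean isManagedHost(String hostName, Set<String> hostRows,
                                 Map<String, Set<String>> clustersByHost) {
        if (!hostRows.contains(hostName)) {
            return false;
        }
        Set<String> clusters = clustersByHost.get(hostName);
        return clusters != null && !clusters.isEmpty();
    }

    /**
     * Re-checks the condition up to maxAttempts times, sleeping between attempts.
     * Returns true as soon as the condition holds, false if it never does.
     */
    static boolean awaitCondition(BooleanSupplier condition, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Illustrative state: worker3 was re-inserted by a heartbeat, so it has no mapping.
        Set<String> hostRows = Set.of("master1.example.com", "worker3.example.com");
        Map<String, Set<String>> clustersByHost =
                Map.of("master1.example.com", Set.of("mycluster"));

        System.out.println(isManagedHost("master1.example.com", hostRows, clustersByHost)); // prints true
        System.out.println(isManagedHost("worker3.example.com", hostRows, clustersByHost)); // prints false

        // Wait (up to 4 checks, 500 ms apart) for the host to stop counting as managed.
        boolean gone = awaitCondition(
                () -> !isManagedHost("worker3.example.com", hostRows, clustersByHost), 4, 500L);
        System.out.println(gone); // prints true
    }
}
```

The attempt count and sleep interval are arbitrary; the report only says that A2 usually finishes within 1 or 2 seconds.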

          People

            Assignee: Unassigned
            Reporter: h.s