Ambari / AMBARI-14727

Cluster create looping on TopologyManager areHostGroupsResolved


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1.2
    • Fix Version/s: None
    • Component/s: ambari-server, blueprints
    • Labels: None
    • Environment: Ubuntu 12.04, HDP 2.3, Ambari 2.1.2

    Description

      I am installing a cluster from a blueprint with two host groups, "server_group" and "agent_group". At cluster creation time, the server is the only host installed; the agent is installed in a later step.

      This worked fine until the "agent_group" host group was augmented with a "ZOOKEEPER_SERVER" instance (making a total of two zookeeper servers).

      With this change, the installation stalls at 0 percent with no errors logged. However, a success message is logged repeatedly, which suggests a critical failure that is not being logged.

      The only similar issue I could find is AMBARI-10811. Based on that, I suspect the root cause is that having two ZOOKEEPER_SERVER components triggers some HA requirements.

      The ambari-server log loops on this line:
      INFO [pool-3-thread-1] TopologyManager:598 - TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = server_group has been fully resolved, as all 1 required hosts are mapped to 1 physical hosts.

      Looking at the source for the TopologyManager main loop, it appears that the line "completed = areRequiredHostGroupsResolved(requiredHostGroups)" never yields a TRUE result. However, the only logging from "areRequiredHostGroupsResolved" is the line quoted above, which indicates a TRUE result.

      I think the failure case in areRequiredHostGroupsResolved is being triggered without any logging. The failure message is wrapped in an IF condition, so it is not guaranteed to be logged:

      if (groupInfo != null) {
        LOG.info("TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = {} requires {} hosts to be mapped, but only {} are available.",
            groupInfo.getHostGroupName(), groupInfo.getRequestedHostCount(), groupInfo.getHostNames().size());
      }

      A failure message should be logged outside of this condition, or in an ELSE branch, so that the failure is always reported.
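
The suggestion above can be sketched as follows. This is a minimal, self-contained illustration, not the actual Ambari code: the class HostGroupResolutionSketch, the GroupInfo stand-in, the in-memory LOG list, and the wording of the "no host information" message are all assumptions made for the example. The real method uses its own types and a real logger. The point is only that every failure path, including the one where no group info exists, produces a log line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HostGroupResolutionSketch {

    // Hypothetical stand-in for Ambari's host group info; not the real class.
    public static final class GroupInfo {
        final String hostGroupName;
        final int requestedHostCount;
        final List<String> hostNames;

        public GroupInfo(String hostGroupName, int requestedHostCount, List<String> hostNames) {
            this.hostGroupName = hostGroupName;
            this.requestedHostCount = requestedHostCount;
            this.hostNames = hostNames;
        }
    }

    // Messages are collected in a list (instead of a real logger) so the sketch has no dependencies.
    public static final List<String> LOG = new ArrayList<>();

    private static final String PREFIX = "TopologyManager.ConfigureClusterTask areHostGroupsResolved: ";

    // Returns true only if every required group has enough hosts mapped.
    // Every outcome, including a missing GroupInfo, is logged.
    public static boolean areRequiredHostGroupsResolved(Collection<String> requiredGroups,
                                                        Map<String, GroupInfo> knownGroups) {
        boolean allResolved = true;
        for (String groupName : requiredGroups) {
            GroupInfo info = knownGroups.get(groupName);
            if (info == null) {
                // The branch the report says is silent today (message text is an assumption).
                LOG.add(PREFIX + "host group name = " + groupName
                        + " has no host information available, so it cannot be resolved.");
                allResolved = false;
            } else if (info.hostNames.size() < info.requestedHostCount) {
                LOG.add(String.format(PREFIX
                        + "host group name = %s requires %d hosts to be mapped, but only %d are available.",
                        info.hostGroupName, info.requestedHostCount, info.hostNames.size()));
                allResolved = false;
            } else {
                LOG.add(String.format(PREFIX
                        + "host group name = %s has been fully resolved, as all %d required hosts are mapped to %d physical hosts.",
                        info.hostGroupName, info.requestedHostCount, info.hostNames.size()));
            }
        }
        return allResolved;
    }

    public static void main(String[] args) {
        Map<String, GroupInfo> knownGroups = new HashMap<>();
        knownGroups.put("server_group", new GroupInfo("server_group", 1, Arrays.asList("host1")));
        // "agent_group" is deliberately left out to exercise the previously silent branch.
        boolean resolved = areRequiredHostGroupsResolved(
                Arrays.asList("server_group", "agent_group"), knownGroups);
        System.out.println("resolved = " + resolved);
        for (String line : LOG) {
            System.out.println(line);
        }
    }
}
```

With logging on every branch, a loop like the one described here would show which host group is blocking resolution instead of repeating only the success line for "server_group".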

          People

            Assignee: Unassigned
            Reporter: Anil Mahajan (anilm2)