AMBARI-14727

Cluster create looping on TopologyManager areHostGroupsResolved

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1.2
    • Fix Version/s: None
    • Component/s: ambari-server, blueprints
    • Labels:
      None
    • Environment:

      Ubuntu 12.04
      HDP 2.3
      Ambari 2.1.2

      Description

      I am installing a cluster from a blueprint that has two host groups, "server_group" and "agent_group". When the cluster is created, the server is the only host installed; the agent is installed in a later step.

      This worked fine until the "agent_group" host group was augmented with a "ZOOKEEPER_SERVER" instance (making a total of two zookeeper servers).

      With this change, the installation stalls at 0 percent with no errors logged. However, a success message is logged repeatedly, which suggests a critical failure that is not being logged.

      The only similar issue I could find was AMBARI-10811. Based on that, I suspect the root cause is that having two ZOOKEEPER_SERVER components activates some HA requirements.

      The ambari-server log loops on this line:
      INFO [pool-3-thread-1] TopologyManager:598 - TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = server_group has been fully resolved, as all 1 required hosts are mapped to 1 physical hosts.

      Looking at the source for the TopologyManager main loop, it appears that the line "completed = areRequiredHostGroupsResolved(requiredHostGroups)" never gets a TRUE result. However, the only logging from areRequiredHostGroupsResolved is the line quoted above, which indicates a TRUE result.

      I think the failure case in areRequiredHostGroupsResolved is being triggered without any logging. The failure message is wrapped in an IF condition, so it is skipped whenever groupInfo is null:

      if (groupInfo != null) {
        LOG.info("TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = {} requires {} hosts to be mapped, but only {} are available.",
            groupInfo.getHostGroupName(), groupInfo.getRequestedHostCount(), groupInfo.getHostNames().size());
      }

      There should be logging outside of the condition or in an ELSE branch, so that every unresolved host group produces a log line.
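
      To make the suggestion concrete, here is a minimal, self-contained sketch of a resolution check that logs in every branch. It is not the actual Ambari code: the class HostGroupResolutionSketch, the simplified GroupInfo stand-in, the in-memory LOG list and the host name "server-host" are all hypothetical, and the scenario in main (a required group with no host information at all) is only an assumption about what might be happening here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HostGroupResolutionSketch {

    /** Hypothetical stand-in for the per-group state; not Ambari's real class. */
    static final class GroupInfo {
        final String hostGroupName;
        final int requestedHostCount;
        final List<String> hostNames;

        GroupInfo(String hostGroupName, int requestedHostCount, List<String> hostNames) {
            this.hostGroupName = hostGroupName;
            this.requestedHostCount = requestedHostCount;
            this.hostNames = hostNames;
        }
    }

    /** Collects messages so the sketch runs without a logging framework. */
    static final List<String> LOG = new ArrayList<>();

    /** Returns true only if every required group has enough mapped hosts. */
    static boolean areRequiredHostGroupsResolved(List<String> requiredHostGroups,
                                                 Map<String, GroupInfo> groups) {
        boolean allResolved = true;
        for (String name : requiredHostGroups) {
            GroupInfo info = groups.get(name);
            if (info != null && info.hostNames.size() >= info.requestedHostCount) {
                LOG.add("host group name = " + name + " has been fully resolved");
            } else if (info != null) {
                allResolved = false;
                LOG.add("host group name = " + name + " requires " + info.requestedHostCount
                        + " hosts to be mapped, but only " + info.hostNames.size()
                        + " are available");
            } else {
                // The branch that is silent in the snippet quoted above.
                allResolved = false;
                LOG.add("host group name = " + name
                        + " is required, but no host information is available for it");
            }
        }
        return allResolved;
    }

    public static void main(String[] args) {
        Map<String, GroupInfo> groups = new LinkedHashMap<>();
        groups.put("server_group",
                new GroupInfo("server_group", 1, Collections.singletonList("server-host")));
        // Assumed scenario: agent_group is required but has no entry yet.
        boolean completed = areRequiredHostGroupsResolved(
                Arrays.asList("server_group", "agent_group"), groups);
        System.out.println("completed = " + completed);
        for (String line : LOG) {
            System.out.println(line);
        }
    }
}
```

      With a final ELSE like this, a loop that keeps getting a FALSE result would at least name the host group that is holding it up, instead of only repeating the success message for server_group.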

            People

            • Assignee:
              Unassigned
              Reporter:
              anilm2 Anil Mahajan
