AMBARI-12720: Blueprint Logical Request stuck in waiting mode during large cluster deployments

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.1.1
    • Component/s: ambari-server
    • Labels: None

    Description

      During Blueprint deployments of large clusters (50 or more nodes), a logical request intermittently never completes: one or more expected host registrations do not complete, so the request cannot be fully resolved. The UI then shows the logical request as pending, and the cluster deployment never finishes.

      This tends to happen under heavy load with large cluster sizes. This also tends to happen more frequently when hosts in the cluster are registered with the TopologyManager during the Blueprint configuration phase.

      This appears to be a concurrency problem with the TopologyManager.
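To illustrate the kind of race that could produce this symptom, here is a minimal sketch. It is not Ambari's TopologyManager code and is not taken from the attached patch; the class `HostRequestMatcher` and all of its methods are hypothetical. It assumes host registrations and host requests arrive on different threads, and that each side checks the other's queue before enqueuing itself. If that check-then-enqueue step is not atomic, a host can register between a request's check and its enqueue, leaving the host idle and the request pending indefinitely. Guarding both paths with one lock closes that window.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical illustration only; not Ambari's TopologyManager.
class HostRequestMatcher {
    // Single lock guarding both queues, so check-then-enqueue is atomic.
    private final Object lock = new Object();
    // Hosts that have registered but are not yet assigned to a request.
    private final Queue<String> availableHosts = new ArrayDeque<>();
    // Host requests still waiting for a registered host.
    private final Queue<String> outstandingRequests = new ArrayDeque<>();
    // Completed request-to-host pairings.
    private final List<String> assignments = new ArrayList<>();

    // Called when a host registers (for example, from an agent thread).
    void onHostRegistered(String host) {
        synchronized (lock) {
            String request = outstandingRequests.poll();
            if (request != null) {
                assignments.add(request + "->" + host);
            } else {
                availableHosts.add(host);
            }
        }
    }

    // Called when a logical request needs a host (for example, from a deploy thread).
    void onHostRequested(String request) {
        synchronized (lock) {
            String host = availableHosts.poll();
            if (host != null) {
                assignments.add(request + "->" + host);
            } else {
                outstandingRequests.add(request);
            }
        }
    }

    int assignedCount() {
        synchronized (lock) {
            return assignments.size();
        }
    }

    int pendingCount() {
        synchronized (lock) {
            return outstandingRequests.size();
        }
    }
}
```

In this sketch, a request that arrives before any host is queued as pending and is matched by the next registration, and a host that arrives early waits in `availableHosts` until a request claims it. Whether the actual defect follows this pattern is an assumption; the description above only states that a concurrency problem is suspected.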

      I'm working on a fix for this, and will be submitting a patch shortly.

      Attachments

        1. AMBARI-12720.patch.2
          6 kB
          Bob Nettleton

            People

              Assignee: Bob Nettleton (rnettleton)
              Reporter: Bob Nettleton (rnettleton)
              Votes: 0
              Watchers: 2

                Time Tracking

                  Original Estimate: 24h
                  Remaining Estimate: 24h
                  Time Spent: Not Specified