Brooklyn / BROOKLYN-580

Rebinding to MachineEntity: sometimes fails to reconnect sensor feeds

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.12.0
    • Fix Version/s: 1.0.0
    • Labels:
      None

      Description

      On rebind, MachineEntity instances sometimes do not have their feeds recreated. This shows up as a non-deterministic test failure in MachineEntityJcloudsRebindTest.

      The problem is that SoftwareProcessImpl.callRebindHooks schedules a task to call connectSensors after a random delay of between 0 and 10 seconds; that task tries to recreate the feeds. However, if it executes too soon (while rebind is still in progress), the SshMachineLocation may not yet be managed, in which case the feed is not created.

      This is most likely to happen when there are many entities and locations, because iterating over them during rebind takes longer. The failure is non-deterministic because the delay before connectSensors is called is random and can sometimes be extremely short (the randomness is there to avoid the thundering herd problem on rebind).
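
      The race described above can be sketched in isolation. The following is a minimal illustration, not Brooklyn code: Location, Entity, jitterMillis and scheduleConnect are hypothetical stand-ins, and the retry in scheduleConnect is just one possible way to avoid the race, not necessarily the fix that was applied for this issue. The delays are shortened from the 0 to 10 seconds mentioned above so the demo finishes quickly.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class RebindFeedSketch {

    /** Hypothetical stand-in for a location that only becomes managed partway through rebind. */
    static final class Location {
        private final AtomicBoolean managed = new AtomicBoolean(false);
        boolean isManaged() { return managed.get(); }
        void markManaged() { managed.set(true); }
    }

    /** Hypothetical stand-in for an entity whose feed needs a managed location. */
    static final class Entity {
        private final Location location;
        private final AtomicBoolean feedConnected = new AtomicBoolean(false);

        Entity(Location location) { this.location = location; }

        /** Mirrors the reported symptom: no feed is created if the location is not yet managed. */
        boolean connectSensors() {
            if (!location.isManaged()) {
                return false;
            }
            feedConnected.set(true);
            return true;
        }

        boolean isFeedConnected() { return feedConnected.get(); }
    }

    /** Random delay in [0, maxMillis), spreading reconnects out to avoid a thundering herd. */
    static long jitterMillis(long maxMillis) {
        return ThreadLocalRandom.current().nextLong(maxMillis);
    }

    /** Assumed mitigation: reschedule until connectSensors succeeds instead of giving up. */
    static void scheduleConnect(ScheduledExecutorService exec, Entity entity,
                                long delayMillis, long retryMillis) {
        exec.schedule(() -> {
            if (!entity.connectSensors()) {
                scheduleConnect(exec, entity, retryMillis, retryMillis);
            }
        }, delayMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        Location location = new Location();
        Entity entity = new Entity(location);
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();

        // The first attempt may fire while the location is still unmanaged.
        scheduleConnect(exec, entity, jitterMillis(50), 20);

        Thread.sleep(100);       // rebind still in progress
        location.markManaged();  // rebind has now managed the location

        // Wait (bounded) for a retry to connect the feed.
        long deadline = System.currentTimeMillis() + 5_000;
        while (!entity.isFeedConnected() && System.currentTimeMillis() < deadline) {
            Thread.sleep(10);
        }
        exec.shutdownNow();
        System.out.println("feed connected: " + entity.isFeedConnected());
    }
}
```

      Without the retry, a single connectSensors call that lands before markManaged would leave the entity with no feed, which matches the symptom reported here.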

      Although the symptoms are similar to https://issues.apache.org/jira/browse/BROOKLYN-425, the underlying cause is different, so this is treated as a new issue rather than a reopening of the old one.

              People

              • Assignee:
                Unassigned
                Reporter:
                aled.sage Aled Sage
              • Votes:
                0
                Watchers:
                2
