SLING-10489

Ignore partially started, newly joining instances to avoid disturbing discovery (for a while)

Details

    Description

      Discovery.oak requires that both Oak and Sling are operating normally in order to declare victory and announce a new topology.

      The startup phase is especially tricky in this regard, since multiple elements need to be updated (some in the Oak layer, some in Sling):

      • lease & clusterNodeId: maintained by Oak
      • idMap: maintained by IdMapService (Sling)
      • leaderElectionId: maintained by OakViewChecker (Sling)
      • syncToken: maintained by SyncTokenService (Sling)
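
      The four elements listed above could be folded into a single readiness check, sketched here in Java. The class InstanceStartupState, its fields and its method names are hypothetical and are not taken from the discovery.oak code base; the sketch only illustrates that an instance would count as fully started once every element is present.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical snapshot of what a joining instance has published so far.
// All names are illustrative assumptions, not discovery.oak API.
final class InstanceStartupState {
    final boolean hasLeaseAndClusterNodeId; // maintained by Oak
    final boolean hasIdMapEntry;            // maintained by IdMapService (Sling)
    final boolean hasLeaderElectionId;      // maintained by OakViewChecker (Sling)
    final boolean hasSyncToken;             // maintained by SyncTokenService (Sling)

    InstanceStartupState(boolean hasLeaseAndClusterNodeId, boolean hasIdMapEntry,
                         boolean hasLeaderElectionId, boolean hasSyncToken) {
        this.hasLeaseAndClusterNodeId = hasLeaseAndClusterNodeId;
        this.hasIdMapEntry = hasIdMapEntry;
        this.hasLeaderElectionId = hasLeaderElectionId;
        this.hasSyncToken = hasSyncToken;
    }

    // An instance is only fully started when the Oak part and all Sling parts are present.
    boolean isFullyStarted() {
        return hasLeaseAndClusterNodeId && hasIdMapEntry
                && hasLeaderElectionId && hasSyncToken;
    }

    // Lists the elements that are still missing, e.g. for a log message.
    List<String> missingElements() {
        List<String> missing = new ArrayList<>();
        if (!hasLeaseAndClusterNodeId) missing.add("lease & clusterNodeId");
        if (!hasIdMapEntry) missing.add("idMap");
        if (!hasLeaderElectionId) missing.add("leaderElectionId");
        if (!hasSyncToken) missing.add("syncToken");
        return missing;
    }
}
```

      An instance where Oak is up but the Sling bundles are not yet active would have only the first flag set, and missingElements() would name the three Sling-maintained elements.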

      Situations have been seen where Oak starts up fine, but higher-level (e.g. Sling) bundles were not activated within a reasonable amount of time. This led to discovery staying in the TOPOLOGY_CHANGING state for longer than expected.

      There should be a mechanism that ignores (suppresses) newly joining instances if they start up only partially. However, after a certain timeout this mechanism should give up.
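
      A minimal sketch of such a mechanism follows, assuming each instance is identified by its clusterNodeId and that the caller already knows whether the instance is fully started. The class PartialStartupSuppressor, its method shouldSuppress and the timeout parameter are hypothetical names chosen for illustration; this is not the implementation that was committed for this issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: suppresses partially started instances until a timeout elapses.
final class PartialStartupSuppressor {
    private final long timeoutMillis;
    // clusterNodeId -> time at which the instance was first seen partially started
    private final Map<Integer, Long> firstSeenPartial = new HashMap<>();

    PartialStartupSuppressor(long timeoutMillis) {
        if (timeoutMillis < 0) {
            throw new IllegalArgumentException("timeoutMillis must not be negative");
        }
        this.timeoutMillis = timeoutMillis;
    }

    // Returns true while the instance should be ignored, false once it is
    // fully started or once the suppression has timed out (the "give up" case).
    synchronized boolean shouldSuppress(int clusterNodeId, boolean fullyStarted, long nowMillis) {
        if (fullyStarted) {
            // Nothing to suppress; forget any earlier partial sighting.
            firstSeenPartial.remove(clusterNodeId);
            return false;
        }
        // Remember when this instance was first observed as partially started.
        long firstSeen = firstSeenPartial.computeIfAbsent(clusterNodeId, id -> nowMillis);
        return nowMillis - firstSeen < timeoutMillis;
    }
}
```

      The current time is passed in as a parameter rather than read from the system clock, which keeps the timeout behaviour easy to exercise in a unit test. Once the timeout has elapsed the instance is no longer ignored, so discovery would treat it like any other joining instance.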

          People

            stefanegli Stefan Egli

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 8h 10m