Ignite / IGNITE-18171

Describe node start/stop scenarios


Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • None
    • None
    • sql

    Description

      Definitions.

      We can distinguish the following groups of cluster nodes. Each node may belong to one or more groups.

      • Cluster Management Group (CMG), which controls the process of new nodes joining the cluster.
      • MetaStorage group (MSG), which hosts the meta storage.
      • Data node group (DNG), which only hosts table partitions.

      The components (CMG, meta storage, table components) depend on each other, but may reside on different (even disjoint) subsets of nodes. As a result, some components may become temporarily unavailable. Dependent components must be aware of this and handle it (wait, retry, throw an exception, and so on) in an expected way, which also has to be documented.
      See the IEP for details.
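The "wait, retry, throw exception" handling mentioned above could be sketched as follows. This is only an illustration under stated assumptions: `ComponentUnavailableException` and `RetryingCaller` are hypothetical names, not Ignite classes, and the timeout and pause values are arbitrary.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Hypothetical exception: a dependency (e.g. meta storage) is temporarily unavailable. */
class ComponentUnavailableException extends RuntimeException {
    ComponentUnavailableException(String message) {
        super(message);
    }
}

/** Hypothetical helper, not part of Ignite. */
final class RetryingCaller {
    /**
     * Calls the action, retrying while it reports temporary unavailability.
     * Once the timeout has elapsed, the last failure is rethrown to the caller.
     */
    static <T> T callWithRetry(Supplier<T> action, Duration timeout, Duration pause) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                return action.get();
            } catch (ComponentUnavailableException e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // Timed out: surface the failure instead of waiting forever.
                }
                try {
                    Thread.sleep(pause.toMillis()); // Wait before the next attempt.
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

Whether a given component should wait, retry, or fail fast is exactly what this task asks to document; the sketch only shows how the three options can combine.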

      Motivation.

      As of now, the correct way to start the grid (after it was stopped) is to start CMG nodes first, then MetaStorage nodes, then data nodes. The correct way to stop it is the reverse order. Other scenarios are not tested and may lead to unexpected behaviour.
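A minimal sketch of that ordering rule, assuming a node that belongs to several groups is started with the earliest of its groups. `NodeRole`, `ClusterNode`, and `StartStopOrder` are hypothetical names used only for illustration, not Ignite APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

/** Hypothetical roles; the declaration order mirrors the start order. */
enum NodeRole { CMG, META_STORAGE, DATA }

/** Hypothetical node descriptor: a name and the groups the node belongs to. */
final class ClusterNode {
    final String name;
    final Set<NodeRole> roles;

    ClusterNode(String name, Set<NodeRole> roles) {
        this.name = name;
        this.roles = roles;
    }

    /** Rank of the earliest-starting role of this node (assumption: that role decides). */
    int startRank() {
        return roles.stream().mapToInt(NodeRole::ordinal).min().orElse(NodeRole.DATA.ordinal());
    }
}

final class StartStopOrder {
    /** CMG nodes first, then meta storage nodes, then data nodes. */
    static List<ClusterNode> startOrder(List<ClusterNode> nodes) {
        List<ClusterNode> ordered = new ArrayList<>(nodes);
        ordered.sort(Comparator.comparingInt(ClusterNode::startRank)); // Stable sort.
        return ordered;
    }

    /** The stop order is the start order reversed. */
    static List<ClusterNode> stopOrder(List<ClusterNode> nodes) {
        List<ClusterNode> ordered = startOrder(nodes);
        Collections.reverse(ordered);
        return ordered;
    }
}
```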

      Let's describe all possible scenarios and the expected behaviour for each of them, and extend the test coverage.

       

      Results.

      Scenarios to check

      • Startup scenarios, in which nodes start in different orders to check that the grid assembles and operates correctly. These seem to make sense only when a CMG node starts first, because the grid can't be assembled otherwise.
      • Restart scenarios, in which nodes of different roles are stopped and then started on various grid configurations to check how services degrade and recover. In contrast to startup scenarios, some grid configurations behave differently here, e.g. a grid without a CMG node.
      • Stop scenarios are covered by the restart scenarios.
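The startup scenarios could be enumerated mechanically. The sketch below generates every start order in which a CMG node comes first; `StartupScenarios` and the node names are hypothetical, and a real test would start actual nodes for each generated order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Hypothetical scenario generator, not part of Ignite. */
final class StartupScenarios {
    /** Returns every ordering of the nodes whose first element is a CMG node. */
    static List<List<String>> cmgFirstOrders(List<String> nodes, Set<String> cmgNodes) {
        List<List<String>> result = new ArrayList<>();
        permute(new ArrayList<>(nodes), 0, cmgNodes, result);
        return result;
    }

    /** Classic swap-based permutation, pruned so that position 0 holds a CMG node. */
    private static void permute(List<String> nodes, int from, Set<String> cmgNodes,
                                List<List<String>> out) {
        if (from == nodes.size()) {
            out.add(new ArrayList<>(nodes));
            return;
        }
        for (int i = from; i < nodes.size(); i++) {
            if (from == 0 && !cmgNodes.contains(nodes.get(i))) {
                continue; // Skip orders that do not start with a CMG node.
            }
            Collections.swap(nodes, from, i);
            permute(nodes, from + 1, cmgNodes, out);
            Collections.swap(nodes, from, i); // Restore the list for the next candidate.
        }
    }
}
```

For three nodes with a single CMG node this yields two orders. The number of orders grows factorially with cluster size, so in practice the scenarios would be limited to small grids.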

      Operations

      • Operation in an RO transaction. This requires at least one follower to be alive.
      • Operation in an RW transaction. This requires a quorum (leader).
      • Operation in an implicit transaction.
      • Creating a distribution zone. This requires a MetaStorage quorum (and possibly data nodes). Not implemented yet.
      • DDL operation: creating a table in a distribution zone that has a quorum, has followers but no quorum, or has no followers. This requires a MetaStorage quorum.
      • Starting a new node and checking that it is present in the logical and/or physical topologies. This requires a CMG quorum (and possibly MetaStorage).
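The requirements listed above can be written down as simple predicates, which may help when deciding the expected outcome of each test scenario. This sketch assumes that "quorum" means a strict majority of the group members; the ticket does not state this. `OperationAvailability` and its method names are hypothetical. Implicit transactions are left out because no requirement is given for them.

```java
/** Hypothetical availability checks, not part of Ignite. */
final class OperationAvailability {
    /** Assumption: a quorum is a strict majority of the group members. */
    static boolean hasQuorum(int aliveMembers, int groupSize) {
        return groupSize > 0 && aliveMembers > groupSize / 2;
    }

    /** RO transaction: at least one replica of the partition is alive. */
    static boolean canRunReadOnlyTx(int alivePartitionReplicas) {
        return alivePartitionReplicas >= 1;
    }

    /** RW transaction: the partition group has a quorum, so a leader can exist. */
    static boolean canRunReadWriteTx(int alivePartitionReplicas, int partitionGroupSize) {
        return hasQuorum(alivePartitionReplicas, partitionGroupSize);
    }

    /** DDL (e.g. creating a table): the meta storage group has a quorum. */
    static boolean canRunDdl(int aliveMetaStorageNodes, int metaStorageGroupSize) {
        return hasQuorum(aliveMetaStorageNodes, metaStorageGroupSize);
    }

    /** Node join: the CMG has a quorum (a possible meta storage requirement is not modelled). */
    static boolean canJoinNode(int aliveCmgNodes, int cmgSize) {
        return hasQuorum(aliveCmgNodes, cmgSize);
    }
}
```

For example, with a three-node partition group and one replica alive, an RO operation is expected to succeed while an RW operation is not.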

          People

            amashenkov Andrey Mashenkov