Ignite / IGNITE-22724

(flaky) aimem: node doesn't recover after data clean and goes into state BROKEN


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0
    • Fix Version/s: None
    • Component/s: persistence
    • Environment: 3 distributed nodes (with 1 CMG), storage AIMEM
    • Ignite Flags: Docs Required, Release Notes Required

    Description

      This failure sometimes manifests as https://issues.apache.org/jira/browse/IGNITE-22725 instead.

      Steps to reproduce (a scripted sketch of the setup and await logic follows this list):

      1. Create a 3-node cluster with 1 CMG node (node_0 - CMG, node_1, node_2).
      2. Create a zone with a replica count equal to the number of nodes (3).
      3. Create 10 tables inside the zone.
      4. Insert 100 rows into every table.
      5. Await that the local state of all tables*partitions*nodes is "HEALTHY".
      6. Await that the global state of all tables*partitions*nodes is "AVAILABLE".
      7. Kill a non-CMG node with kill -9 (kill node_1).
      8. Clean the data in the work directory of the killed node (node_1).
      9. Start the killed node.
      10. Using the REST API, await that the physical topology has 3 alive nodes.
      11. Using the REST API, await that the logical topology has 3 alive nodes.
      12. Await that the local state of all tables*partitions*nodes is "HEALTHY" (by connecting to the REST endpoint of node_2).
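
      Steps 2-4 reduce to a zone whose replica count matches the cluster size plus tables bound to it, and steps 5-6 and 10-12 are REST polling loops. Below is a minimal Python sketch of that setup and await logic. Assumptions not taken from this ticket: the DDL/DML is executed through the Ignite SQL client or CLI; a storage profile named 'aimem' backed by the in-memory engine; the default REST port 10300; the /management/v1/recovery/state/{local|global} paths and their response shape (modeled on the disaster-recovery REST API and possibly build-dependent); and the placeholder names test_zone and t0..t9.

      import time

      import requests

      REST = "http://localhost:10300"  # REST address of node_2; 10300 is the default port
      NODES = 3
      TABLES = 10
      ROWS = 100

      # Steps 2-4: DDL/DML kept as strings, to be run through the SQL client or
      # CLI. The table-to-zone binding clause has varied across 3.0 builds
      # (ZONE vs. WITH PRIMARY_ZONE=...), so treat this spelling as an assumption.
      statements = ["CREATE ZONE IF NOT EXISTS test_zone WITH REPLICAS=3, STORAGE_PROFILES='aimem'"]
      for i in range(TABLES):
          statements.append(f"CREATE TABLE t{i} (id INT PRIMARY KEY, val VARCHAR) ZONE test_zone")
          statements += [f"INSERT INTO t{i} VALUES ({row}, 'v{row}')" for row in range(ROWS)]

      def await_condition(check, timeout_s=120, poll_s=1.0):
          """Poll `check` until it returns True or the timeout expires."""
          deadline = time.monotonic() + timeout_s
          while time.monotonic() < deadline:
              if check():
                  return
              time.sleep(poll_s)
          raise TimeoutError(f"condition not met within {timeout_s}s")

      def topology_size(kind):
          # Steps 10-11: kind is "physical" or "logical"; assuming the response
          # body is a JSON array of node descriptors.
          resp = requests.get(f"{REST}/management/v1/cluster/topology/{kind}")
          resp.raise_for_status()
          return len(resp.json())

      def partition_states(scope):
          # Steps 5-6 and 12: scope is "local" or "global"; the path and the
          # {"states": [{"state": ...}, ...]} response shape are assumptions.
          resp = requests.get(f"{REST}/management/v1/recovery/state/{scope}")
          resp.raise_for_status()
          return [entry["state"] for entry in resp.json()["states"]]

      # Steps 7-9 happen outside this script: kill -9 the node_1 process, delete
      # node_1's work directory, restart node_1, then run the awaits below.
      await_condition(lambda: topology_size("physical") == NODES)
      await_condition(lambda: topology_size("logical") == NODES)
      # Step 12 is where the reproduction hangs: some partitions stay "BROKEN".
      await_condition(lambda: all(s == "HEALTHY" for s in partition_states("local")))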

      Expected:
      All partitions become "HEALTHY".

      Actual:
      Partitions are in a mix of states: "HEALTHY" and "BROKEN" (see servers_logs.zip).

      Comments:

      This test works fine with aipersist storage.
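
      The only intended difference between the failing and passing runs is therefore the storage engine behind the zone's storage profile, which points at the recovery of volatile (aimem) partitions after a data wipe rather than at the test itself. A minimal sketch of the delta, assuming profiles named after their engines:

      # Fails: partitions end up "BROKEN" after the clean-and-restart cycle.
      ZONE_AIMEM = "CREATE ZONE test_zone WITH REPLICAS=3, STORAGE_PROFILES='aimem'"

      # Passes: all partitions return to "HEALTHY".
      ZONE_AIPERSIST = "CREATE ZONE test_zone WITH REPLICAS=3, STORAGE_PROFILES='aipersist'"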

      Attachments

        servers_logs.zip


          People

            Assignee: Unassigned
            Reporter: Igor (lunigorn)
