Ignite / IGNITE-8893

Blinking node in baseline may corrupt own WAL records


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.5
    • None
    • None
    • None

    Description

      1. Start a cluster and load data.
      2. Start an additional node that is not in the baseline topology (BLT).
      3. Repeat 10 times: kill one node in the baseline and one node not in the baseline, then start the baseline node and the non-baseline node again.
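The loop in step 3 can be sketched roughly as follows. This is only an illustration of the order of actions: `NodeControl`, `run`, `record`, and the node names are hypothetical and are not part of Ignite or of the reporter's test harness.

```java
import java.util.ArrayList;
import java.util.List;

public class BlinkScenarioSketch {
    /** Hypothetical hook for stopping and starting nodes; not an Ignite API. */
    interface NodeControl {
        void kill(String node);
        void start(String node);
    }

    /** One iteration kills both nodes, then starts both again. */
    static void run(NodeControl ctl, String baselineNode, String extraNode, int iterations) {
        for (int i = 0; i < iterations; i++) {
            ctl.kill(baselineNode);   // node that is in the baseline
            ctl.kill(extraNode);      // node that is not in the baseline
            ctl.start(baselineNode);
            ctl.start(extraNode);
        }
    }

    /** Runs the loop against a recorder and returns the ordered action log. */
    static List<String> record(String baselineNode, String extraNode, int iterations) {
        List<String> log = new ArrayList<>();
        run(new NodeControl() {
            @Override public void kill(String node) { log.add("kill " + node); }
            @Override public void start(String node) { log.add("start " + node); }
        }, baselineNode, extraNode, iterations);
        return log;
    }

    public static void main(String[] args) {
        List<String> log = record("blt-node", "extra-node", 10);
        System.out.println(log.size() + " actions, first: " + log.get(0));
    }
}
```

In a real reproduction the two callbacks would stop and start actual Ignite processes; the recorder here only makes the sequence visible.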

      At some point the baseline node may be unable to start because of a corrupted WAL (see the log below).
      Note that there is no load on the cluster at all, so there is no reason for the WAL to become corrupted; rebalance should be interruptible.

      There is also another scenario that may cause the same error (and may also cause a JVM crash):

      1. Start a cluster, load data, start nodes.
      2. Repeat 10 times: kill one node in the baseline, clean its LFS, and start the node again; while rebalance is in progress, blink the node that should rebalance data to the previously killed node.

      The node that should rebalance data to the cleaned node may corrupt its own WAL. However, this second scenario has a configuration "error": the number of backups in each case is 1, so two nodes blinking at the same time can obviously cause data loss.
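Why one backup is not enough when two nodes are down together can be shown with a small model. This is a sketch under a simplifying assumption: partitions are assigned round-robin to a primary and the next node(s), which is not Ignite's actual affinity function. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BackupLossSketch {
    /** Owners of a partition: the primary plus `backups` following nodes (assumed round-robin layout). */
    static List<Integer> owners(int part, int nodes, int backups) {
        List<Integer> res = new ArrayList<>();
        for (int i = 0; i <= backups; i++)
            res.add((part + i) % nodes);
        return res;
    }

    /** Partitions that have no surviving owner while all nodes in `down` are stopped. */
    static List<Integer> lostPartitions(int parts, int nodes, int backups, Set<Integer> down) {
        List<Integer> lost = new ArrayList<>();
        for (int p = 0; p < parts; p++)
            if (down.containsAll(owners(p, nodes, backups)))
                lost.add(p);
        return lost;
    }

    public static void main(String[] args) {
        // 4 nodes, 8 partitions, 1 backup; nodes 0 and 1 are down at the same time.
        System.out.println(lostPartitions(8, 4, 1, new HashSet<>(Arrays.asList(0, 1)))); // [0, 4]
        // Only node 0 is down: every partition still has one copy.
        System.out.println(lostPartitions(8, 4, 1, new HashSet<>(Arrays.asList(0))));    // []
    }
}
```

In this model a single stopped node never loses a partition, while two stopped nodes lose every partition whose primary and backup are exactly those two nodes.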

      [2018-06-28 17:33:39,583][ERROR][wal-file-archiver%null-#63][root] Critical system error detected. Will be handled accordingly to configured handler [hnd=class o.a.i.failure.StopNodeOrHaltFailureHandler, failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.AssertionError: lastArchived=757, current=42]]
      java.lang.AssertionError: lastArchived=757, current=42
              at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.body(FileWriteAheadLogManager.java:1629)
              at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
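The assertion message reads like a failed ordering check between two WAL segment indexes: the last archived index (757) is larger than the current one (42). A minimal sketch of such a check follows, assuming the invariant is "last archived index must not exceed the current index". The class and method names are hypothetical and this is not the actual `FileArchiver` code.

```java
public class ArchiverInvariantSketch {
    /**
     * Returns null when the indexes are consistent, otherwise a message
     * shaped like the one in the log above.
     */
    static String check(long lastArchived, long current) {
        if (lastArchived <= current)
            return null;
        return "lastArchived=" + lastArchived + ", current=" + current;
    }

    public static void main(String[] args) {
        System.out.println(check(41, 42));  // consistent state
        System.out.println(check(757, 42)); // the state reported in this issue
    }
}
```

Under this reading, the node restarted with a current segment index far behind what had already been archived, which would be consistent with WAL state being lost or reset across the blink.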

            People

              Assignee: Unassigned
              Reporter: Dmitry Sherstobitov (qvad)
              Votes: 0
              Watchers: 2
