Ignite / IGNITE-11121

MVCC TX: AssertionError in discovery manager on BLT change.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8
    • Component/s: mvcc
    • Labels: None

    Description

      The following exception appears in the logs on a baseline topology (BLT) change.

      [12:11:36,912][SEVERE][sys-#87][GridClosureProcessor] Closure execution failed with error.
      java.lang.AssertionError
              at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.node(GridDiscoveryManager.java:1794)
              at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1693)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$NodeFailureTimeoutObject.lambda$onTimeout0$16553d7$1(IgniteTxManager.java:2592)
              at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399)
              at org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:354)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$NodeFailureTimeoutObject.onTimeout0(IgniteTxManager.java:2588)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$NodeFailureTimeoutObject.access$3300(IgniteTxManager.java:2505)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$NodeFailureTimeoutObject$1.run(IgniteTxManager.java:2623)
              at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6874)
              at org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
              at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      

      The stack trace shows that a node failure triggered transaction recovery while the MVCC coordinator was uninitialized. An uninitialized coordinator means that there are no server nodes, that a coordinatorAssignClosure returned no result, or that the recovering node was not activated.
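
      To illustrate the failure mode, here is a minimal, self-contained sketch. It is a toy model only: `CoordinatorGuardSketch`, `lookupNode`, `sendUnguarded` and `sendGuarded` are hypothetical names and are not Ignite classes or methods. It assumes the unassigned coordinator shows up as a null node id, and it shows one possible guard, not necessarily the fix that was applied for this issue.

```java
import java.util.Optional;
import java.util.UUID;

/** Toy model of the failure mode; none of these names exist in Ignite. */
public class CoordinatorGuardSketch {
    /** Mimics a node lookup that rejects a null id, like the top frame of the trace. */
    static String lookupNode(UUID nodeId) {
        if (nodeId == null)
            throw new AssertionError("node id must not be null");

        return "node-" + nodeId;
    }

    /** Unguarded send: fails when the coordinator id is null (assumed to mean "unassigned"). */
    static String sendUnguarded(UUID coordinatorId) {
        return "sent to " + lookupNode(coordinatorId);
    }

    /** Guarded send: skips the message when no coordinator is assigned. */
    static Optional<String> sendGuarded(UUID coordinatorId) {
        if (coordinatorId == null)
            return Optional.empty();

        return Optional.of("sent to " + lookupNode(coordinatorId));
    }

    public static void main(String[] args) {
        // No coordinator assigned: the guarded path returns an empty result instead of failing.
        System.out.println(sendGuarded(null).isPresent());

        // Coordinator assigned: the message is sent as usual.
        System.out.println(sendGuarded(UUID.randomUUID()).isPresent());
    }
}
```

      Whether skipping the message is the right behavior in the recovery path depends on what the recovery logic expects, so treat the guard as a starting point for discussion only.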

      Scenario in which the exception may be observed:

      1. Start a cluster
      2. Load some data from a client node, then shut the client node down
      3. Calculate hash
      4. Add new server node
      5. Change BLT
      6. Wait for rebalance
      7. Calculate new hash and check it is the same as previously calculated
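
      Steps 3 and 7 do not say how the hash is computed. As a rough sketch, assuming the check only needs to be independent of the order in which entries are read, an order-independent content hash could look like the following. `DataHashSketch` and `contentHash` are hypothetical names, and a plain `Map<Integer, String>` stands in for the cache contents.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: compares data before and after rebalance via a content hash. */
public class DataHashSketch {
    /** Sums per-entry hashes, so the result does not depend on iteration order. */
    static long contentHash(Map<Integer, String> entries) {
        long sum = 0;

        for (Map.Entry<Integer, String> e : entries.entrySet())
            sum += 31L * e.getKey().hashCode() + e.getValue().hashCode();

        return sum;
    }

    public static void main(String[] args) {
        // Stand-in for the data read before the BLT change (step 3).
        Map<Integer, String> before = new LinkedHashMap<>();
        before.put(1, "a");
        before.put(2, "b");

        // Stand-in for the same data read after rebalance (step 7), in another order.
        Map<Integer, String> after = new HashMap<>();
        after.put(2, "b");
        after.put(1, "a");

        System.out.println(contentHash(before) == contentHash(after));
    }
}
```

      A summed hash can collide, so a mismatch proves the data differs, while a match is only strong evidence that it is the same.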

            People

              Assignee: Ivan Pavlukhin (Pavlukhin)
              Reporter: Igor Seliverstov (gvvinblade)
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m