Ignite / IGNITE-19043

ItRaftCommandLeftInLogUntilRestartTest: PageMemoryHashIndexStorage lacks data after cluster restart

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • 3.0.0-beta2
    • None

    Description

      After being enabled, ItRaftCommandLeftInLogUntilRestartTest failed with

      org.opentest4j.AssertionFailedError: expected: not <null> 

      while trying to retrieve previously added data after a cluster restart. It seems that this happens because there is no corresponding data in the PK index.

      It is worth mentioning that the test was originally about raft log re-application on node restart. So I commented out all partitionUpdateInhibitor usages to check whether the failure is related to re-application or to the indexes themselves. The problem is reproducible without the re-application logic.

      It might be related to the migration of defaults from RocksDB to page memory. Further investigation is required.

      Implementation notes

      The investigation showed that the failure occurs because raft log re-application is skipped within PartitionListener#handleUpdateCommand and PartitionListener#handleUpdateAllCommand due to the following logic:

              TxMeta txMeta = txStateStorage.get(cmd.txId());
              if (txMeta != null && (txMeta.txState() == COMMITED || txMeta.txState() == ABORTED)) {
                  storage.runConsistently(() -> {
                      storage.lastApplied(commandIndex, commandTerm);
                      return null;
                  });
              } 
       
      

      The full scenario is as follows:

      1. tx1.put populates the raft log and mvPartitionStorage with the corresponding log record and data.

      2. tx1.commit also populates the raft log with a log record and finishes the transaction within txnStateStorage, along with the cleanup in mvPartitionStorage.

      3. The RocksDB-based txnStateStorage flushes its state to disk, while the page-memory-based storage does not.

      4. After the node restart, raft replays the log, both the put and the commit commands. However, on the commit partition we skip the put re-application because of the aforementioned check:

      if (txMeta != null && (txMeta.txState() == COMMITED || txMeta.txState() == ABORTED))

      To clarify, the transaction is considered committed because txnStateStorage flushes its state before the node stops.

       

      So, to fix this issue, it is enough to remove the skip logic.
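
      The scenario and the effect of removing the skip can be illustrated with a small self-contained toy model. This is only a sketch under stated assumptions, not the actual PartitionListener code: ReplaySketch, Command, replay and valueAfterRestart are hypothetical names, the partition data is modelled as plain maps that start empty after the restart, and the transaction state map stands in for a txnStateStorage that was flushed before the node stopped.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the restart scenario. All names are hypothetical, not Ignite APIs. */
public class ReplaySketch {
    enum TxState { COMMITED, ABORTED }

    /** A raft log entry: a put carries a key and a value, a commit carries neither. */
    static final class Command {
        final String txId;
        final String key;
        final String value;

        Command(String txId, String key, String value) {
            this.txId = txId;
            this.key = key;
            this.value = value;
        }

        boolean isPut() {
            return key != null;
        }
    }

    /**
     * Replays the log after a restart. The tx states survived the restart (they were
     * flushed), while the partition data did not, so it is rebuilt from scratch here.
     */
    static Map<String, String> replay(List<Command> log, Map<String, TxState> txStates, boolean skipFinished) {
        Map<String, Map<String, String>> writeIntents = new HashMap<>();
        Map<String, String> committed = new HashMap<>();

        for (Command cmd : log) {
            if (cmd.isPut()) {
                TxState state = txStates.get(cmd.txId);
                if (skipFinished && (state == TxState.COMMITED || state == TxState.ABORTED)) {
                    // Mirrors the skip from the ticket: only lastApplied would move forward.
                    continue;
                }
                writeIntents.computeIfAbsent(cmd.txId, id -> new HashMap<>()).put(cmd.key, cmd.value);
            } else {
                // Commit: promote whatever write intents were re-applied for this tx.
                Map<String, String> intents = writeIntents.remove(cmd.txId);
                if (intents != null) {
                    committed.putAll(intents);
                }
                txStates.put(cmd.txId, TxState.COMMITED);
            }
        }
        return committed;
    }

    /** Runs the tx1.put + tx1.commit log against a flushed COMMITED state for tx1. */
    static String valueAfterRestart(boolean skipFinished) {
        List<Command> log = List.of(new Command("tx1", "k", "v"), new Command("tx1", null, null));
        Map<String, TxState> flushedTxStates = new HashMap<>();
        flushedTxStates.put("tx1", TxState.COMMITED);
        return replay(log, flushedTxStates, skipFinished).get("k");
    }

    public static void main(String[] args) {
        System.out.println("with skip logic:    " + valueAfterRestart(true));
        System.out.println("without skip logic: " + valueAfterRestart(false));
    }
}
```

      In this model, replaying with the skip enabled leaves the key missing, which corresponds to the "expected: not <null>" assertion failure, while replaying without the skip restores the value.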

            People

              alapin Alexander Lapin
              alapin Alexander Lapin
              Denis Chudov

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 10m