  1. Ignite
  2. IGNITE-20052

Release all locks locally on self primary replica expiration


Details

    Description

      Motivation

      Keeping locks on an expired primary is not only useless but also harmful: for each affected transaction, either the corresponding commitTimestamp has already been calculated or the transaction will be aborted.

      Definition of Done

      • All local partition-specific locks are released on self primary replica expiration.

      Implementation Notes

      • A local onPrimaryExpired callback needs to be introduced.
      • An open question here is how to detect whether a given primary hosted any locks.
      • We've discussed and agreed that the following test should be written. It might not be the only one to write, however it's definitely useful.
        • Start two nodes A and B with partition P1 on node A and partition P2 on node B.
        • Begin transaction Tx1 on node B.
        • Touch P2 on B.
        • Touch P1 on A.
        • Kill node B, which kills the Tx coordinator, the commit partition, and P2.
        • Discard P1 lease prolongation.
        • Await P1 lease expiration and check that locks were released.
      • In order to discard lease prolongation, we may add special placement driver methods that make it possible to discard or transfer a lease. At the very least, they'll be useful for testing.
      • We've agreed that we may duplicate org.apache.ignite.internal.table.distributed.raft.PartitionListener#txsPendingRowIds in order to have it in PartitionReplicaListener. That will allow us to handle primaryReplica.onExpired() in the following way (pseudocode):
        txsPendingRowIds.keySet().forEach(txId -> lockManager.unlock(txId))

        Besides that, the map is required in order to clean up writeIntents on the primary if the primary isn't part of the replication group.

      • It seems that we don't need the whole map, only its keySet, that is, the txIds. The corresponding Set<RowId> values are used to clean up write intents, and the only case where rowIds are explicitly needed on the primary is when the primary itself isn't part of the replication group. In that case the rebalance engine will drop the whole local partition of the old primary along with all corresponding write intents.
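
      The callback idea from the notes above can be sketched in plain Java. This is only an illustration under stated assumptions: ToyLockManager, ToyPartitionReplica, touch, holdsLocks and pendingTxCount are hypothetical names invented for this sketch and are not the real Ignite classes or APIs. Only onPrimaryExpired and unlock(txId) mirror the names used in the ticket. The sketch tracks just the txIds, as suggested in the last note, and releases their locks when the primary expires.

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; none of these types are the real Ignite classes. */
class PrimaryExpirationSketch {
    /** Toy stand-in for a lock manager: txId -> keys locked by that transaction. */
    static class ToyLockManager {
        private final Map<UUID, Set<String>> locksByTx = new ConcurrentHashMap<>();

        void lock(UUID txId, String key) {
            locksByTx.computeIfAbsent(txId, id -> ConcurrentHashMap.newKeySet()).add(key);
        }

        /** Releases every lock held by the given transaction (assumed semantics). */
        void unlock(UUID txId) {
            locksByTx.remove(txId);
        }

        boolean holdsLocks(UUID txId) {
            return locksByTx.containsKey(txId);
        }
    }

    /** Toy stand-in for the replica-side listener of a single partition. */
    static class ToyPartitionReplica {
        private final ToyLockManager lockManager;

        /** Only txIds are tracked here; row ids are deliberately not kept. */
        private final Set<UUID> pendingTxIds = ConcurrentHashMap.newKeySet();

        ToyPartitionReplica(ToyLockManager lockManager) {
            this.lockManager = lockManager;
        }

        /** Simulates a transaction touching a row of this partition. */
        void touch(UUID txId, String rowKey) {
            pendingTxIds.add(txId);
            lockManager.lock(txId, rowKey);
        }

        /** Local callback: release the locks of every transaction seen by this primary. */
        void onPrimaryExpired() {
            for (UUID txId : pendingTxIds) {
                lockManager.unlock(txId);
            }
            pendingTxIds.clear();
        }

        int pendingTxCount() {
            return pendingTxIds.size();
        }
    }

    public static void main(String[] args) {
        ToyLockManager lockManager = new ToyLockManager();
        ToyPartitionReplica p1 = new ToyPartitionReplica(lockManager);
        UUID tx1 = UUID.randomUUID();

        p1.touch(tx1, "row-1");
        System.out.println("locks held before expiration: " + lockManager.holdsLocks(tx1));

        // In the real test the lease would expire after prolongation is discarded;
        // here the callback is simply invoked directly.
        p1.onPrimaryExpired();
        System.out.println("locks held after expiration: " + lockManager.holdsLocks(tx1));
    }
}
```

      Keeping only a set of txIds keeps the replica-side state small, which matches the observation that rowIds are not needed on the primary for lock release. Whether unlock(txId) should release only the locks of the expired partition or all local locks of the transaction is left open in this sketch.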

            People

              v.pyatkov Vladislav Pyatkov
              alapin Alexander Lapin
              Kirill Sizov Kirill Sizov

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 2h