Ignite / IGNITE-22980

Lock manager may fail and lock waiter simultaneously

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • 3.1
    • None
    • Docs Required, Release Notes Required

    Description

      Motivation

      This behavior was neither anticipated nor planned, but currently we can acquire a lock for a waiter:

              // Grants the lock: the intended mode becomes the held mode.
              private void lock() {
                  lockMode = intendedLockMode;
      
                  intendedLockMode = null;
      
                  intendedLocks.clear();
              }
      

      and also make the same waiter fail:

              // Records the failure; does not check whether the lock was already granted.
              private void fail(LockException e) {
                  ex = e;
              }
      

      with nothing to prevent it: there is neither an assertion check nor an explicit prohibition.

      Scenario:

      • tx1 tries to acquire a lock and finds a conflicting transaction, tx2;
      • the lock manager checks the state and the coordinator of tx2;
      • the coordinator of tx2 has left, so a TxRecoveryMessage is sent;
      • the primary replica of the commit partition of tx2 is on the same node, so the TxRecoveryMessage is delivered locally. This also triggers tx recovery: tx2 is finished and tx cleanup is performed locally. All of this happens in the same thread, and during tx cleanup the locks of tx2 are released;
      • the release of the tx2 locks allows the conflicting waiter of tx1 to acquire the lock;
      • processing of the conflicting transaction then continues, and #fail is called on the same waiter.
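
The scenario above can be reproduced in a deliberately simplified, single-threaded model. This is only a sketch: `SketchWaiter`, `SketchLockState` and `ReentrantGrantDemo` are hypothetical names and do not mirror the real HeapLockManager internals. It relies on one fact about Java monitors: a `synchronized` method can be re-entered by the thread that already holds the monitor, so a release that runs synchronously inside the conflict check is not blocked.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified waiter with the same two unguarded methods as in the snippets above. */
class SketchWaiter {
    String lockMode;         // non-null once the lock is granted
    String intendedLockMode; // mode the waiter is still waiting for
    Exception ex;            // non-null once the waiter is failed

    SketchWaiter(String intendedLockMode) {
        this.intendedLockMode = intendedLockMode;
    }

    void lock() {
        lockMode = intendedLockMode;
        intendedLockMode = null;
    }

    void fail(Exception e) {
        ex = e;
    }
}

/** Hypothetical per-key lock state; a stand-in, not the real HeapLockManager. */
class SketchLockState {
    private final List<SketchWaiter> waiters = new ArrayList<>();

    /** tx1 tries to acquire the lock while tx2 holds a conflicting one. */
    synchronized SketchWaiter tryAcquire(Runnable recoverConflictingTx) {
        SketchWaiter waiter = new SketchWaiter("X");
        waiters.add(waiter);

        // Checking tx2 triggers its recovery synchronously, in this very thread.
        recoverConflictingTx.run();

        // Conflict processing continues, unaware that the waiter was just granted.
        waiter.fail(new IllegalStateException("conflict with tx2"));
        return waiter;
    }

    /** Cleanup of tx2: grants all pending waiters. Re-enters the monitor held by tryAcquire. */
    synchronized void releaseConflictingLocks() {
        for (SketchWaiter w : waiters) {
            w.lock();
        }
        waiters.clear();
    }
}

class ReentrantGrantDemo {
    public static void main(String[] args) {
        SketchLockState state = new SketchLockState();
        SketchWaiter w = state.tryAcquire(state::releaseConflictingLocks);
        // Prints: granted=true, failed=true
        System.out.println("granted=" + (w.lockMode != null) + ", failed=" + (w.ex != null));
    }
}
```

The waiter ends up both granted and failed, which is the inconsistent state this issue describes.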

      There is also another problem: tx recovery shouldn't happen within the synchronized block of HeapLockManager. It can be moved to another thread pool; this would also prevent tx recovery, which releases the locks, from granting the lock to the waiter of tx1 in the middle of conflict processing.
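
A minimal sketch of the idea of moving recovery to another pool, under stated assumptions: `AsyncRecoverySketch`, its `mux` monitor and the event strings are hypothetical and only model the ordering, not the actual Ignite code. The recovery task needs the same monitor to release locks, so when it runs on a pool thread it cannot proceed until the caller has left the synchronized block.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical model: recovery is scheduled on an executor instead of being run inline. */
class AsyncRecoverySketch {
    private final Object mux = new Object();
    private final Executor recoveryPool;

    /** Order in which the two steps actually happened. */
    final List<String> events = new CopyOnWriteArrayList<>();

    AsyncRecoverySketch(Executor recoveryPool) {
        this.recoveryPool = recoveryPool;
    }

    /** Called when a conflicting transaction with a departed coordinator is found. */
    void onConflict() {
        synchronized (mux) {
            // Only schedule the recovery here; do not run it under the monitor.
            recoveryPool.execute(this::recoverAndRelease);

            // The rest of the conflict processing completes before any lock is released,
            // provided the executor runs the task on a different thread.
            events.add("conflict processed");
        }
    }

    /** Recovery plus cleanup; it needs the monitor to release the locks. */
    private void recoverAndRelease() {
        synchronized (mux) {
            events.add("locks released");
        }
    }
}

class AsyncRecoveryDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AsyncRecoverySketch sketch = new AsyncRecoverySketch(pool);

        sketch.onConflict();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);

        // Prints: [conflict processed, locks released]
        System.out.println(sketch.events);
    }
}
```

Passing a same-thread executor such as `Runnable::run` instead of the pool reverses the order of the two events, which corresponds to the synchronous behavior described in the scenario.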

      Definition of done

      • Only one of lock() or fail() can be applied to a given lock attempt, never both. Keep in mind that a retry attempt may succeed even though the previous attempt failed. There are also lock upgrade cases: an S-lock can be held while an attempt to upgrade it to an X-lock fails; the upgrade gets its own lock future, which is completed exceptionally, while the S-lock remains active;
      • tx recovery is not executed synchronously within the synchronized block of HeapLockManager.
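
One possible way to express the first requirement is to give every lock attempt its own future and to reject a second completion. The sketch below is an assumption about how this could look, not the fix that was actually committed: `SketchLockMode`, `LockAttemptSketch` and `UpgradableWaiterSketch` are hypothetical names. It uses the fact that `CompletableFuture.complete` and `completeExceptionally` return `false` when the future is already completed.

```java
import java.util.concurrent.CompletableFuture;

enum SketchLockMode { S, X }

/** One lock attempt: completed exactly once, either by lock() or by fail(). */
class LockAttemptSketch {
    final SketchLockMode intended;
    final CompletableFuture<SketchLockMode> future = new CompletableFuture<>();

    LockAttemptSketch(SketchLockMode intended) {
        this.intended = intended;
    }

    void lock() {
        if (!future.complete(intended)) {
            throw new IllegalStateException("Attempt already completed, cannot lock");
        }
    }

    void fail(Exception e) {
        if (!future.completeExceptionally(e)) {
            throw new IllegalStateException("Attempt already completed, cannot fail");
        }
    }
}

/** Hypothetical per-transaction waiter that survives failed attempts. */
class UpgradableWaiterSketch {
    SketchLockMode heldMode; // null until some attempt is granted

    /** Every acquire, upgrade or retry gets a fresh attempt with its own future. */
    LockAttemptSketch newAttempt(SketchLockMode mode) {
        LockAttemptSketch attempt = new LockAttemptSketch(mode);
        // Runs only on a successful grant; a failed attempt leaves heldMode untouched.
        attempt.future.thenAccept(granted -> heldMode = granted);
        return attempt;
    }
}

class LockAttemptDemo {
    public static void main(String[] args) {
        UpgradableWaiterSketch waiter = new UpgradableWaiterSketch();

        waiter.newAttempt(SketchLockMode.S).lock();
        LockAttemptSketch upgrade = waiter.newAttempt(SketchLockMode.X);
        upgrade.fail(new IllegalStateException("upgrade conflict"));
        System.out.println("held after failed upgrade: " + waiter.heldMode); // S

        try {
            upgrade.lock(); // the same attempt cannot also be granted
        } catch (IllegalStateException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }

        waiter.newAttempt(SketchLockMode.X).lock(); // a retry is a new attempt
        System.out.println("held after retry: " + waiter.heldMode); // X
    }
}
```

Keeping the held mode on the waiter and the outcome on the attempt is what lets a failed upgrade leave the S-lock active while a later retry can still succeed.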

            People

              Vladislav Pyatkov
              Vladislav Pyatkov
              Denis Chudov

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m