Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-13652

Deadlock in AbstractCommitLogSegmentManager

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 3.11.1, 4.0
    • Component/s: Core
    • Labels:
      None

      Description

      AbstractCommitLogManager uses LockSupport.(un)park incorreclty. It invokes unpark without checking if manager thread was parked in approriate place.
      For example, logging frameworks uses queues and queues uses ReadWriteLock's that uses LockSupport. Therefore AbstractCommitLogManager.wakeManager can wake thread inside Lock and manager thread will sleep forever at park() method (because unpark permit was already consumed inside lock).

      For examle stack traces:

      "MigrationStage:1" id=412 state=WAITING
          at sun.misc.Unsafe.park(Native Method)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
          at org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:279)
          at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.awaitAvailableSegment(AbstractCommitLogSegmentManager.java:263)
          at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.advanceAllocatingFrom(AbstractCommitLogSegmentManager.java:237)
          at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:279)
          at org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:210)
          at org.apache.cassandra.config.Schema.dropView(Schema.java:708)
          at org.apache.cassandra.schema.SchemaKeyspace.lambda$updateKeyspace$23(SchemaKeyspace.java:1361)
          at org.apache.cassandra.schema.SchemaKeyspace$$Lambda$382/1123232162.accept(Unknown Source)
          at java.util.LinkedHashMap$LinkedValues.forEach(LinkedHashMap.java:608)
          at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
          at org.apache.cassandra.schema.SchemaKeyspace.updateKeyspace(SchemaKeyspace.java:1361)
          at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1332)
          at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1282)
            - locked java.lang.Class@cc38904
          at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:51)
          at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$LocalSessionWrapper.run(DebuggableThreadPoolExecutor.java:322)
          at com.ringcentral.concurrent.executors.MonitoredRunnable.run(MonitoredRunnable.java:36)
          at MON_R_MigrationStage.run(NamedRunnableFactory.java:67)
          at com.ringcentral.concurrent.executors.MonitoredThreadPoolExecutor$MdcAwareRunnable.run(MonitoredThreadPoolExecutor.java:114)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
          at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
          at org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$61/1733339045.run(Unknown Source)
          at java.lang.Thread.run(Thread.java:745)
      
      "COMMIT-LOG-ALLOCATOR:1" id=80 state=WAITING
          at sun.misc.Unsafe.park(Native Method)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
          at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager$1.runMayThrow(AbstractCommitLogSegmentManager.java:128)
          at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
          at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
          at org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$61/1733339045.run(Unknown Source)
          at java.lang.Thread.run(Thread.java:745)
      

      Solution is to use Semaphore instead of low-level LockSupport.

        Issue Links

          Activity

          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user Fuud opened a pull request:

          https://github.com/apache/cassandra/pull/127

          CASSANDRA-13652: Deadlock in AbstractCompactionManager

          Change incorrect usage of LockSupport.park/unpark to Semaphore.
          Old implementation can deadlock because permit from unpark invocation can be consumed by park inside logging framework and manager thread will be parked forever.

          https://issues.apache.org/jira/browse/CASSANDRA-13652

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/Fuud/cassandra commitlog_deadlock

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/cassandra/pull/127.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #127


          commit 175be27297e9933906a9261cd8d0af3a772bff24
          Author: Fedor Bobin <fuudtorrentsru@gmail.com>
          Date: 2017-07-01T12:53:22Z

          CASSANDRA-13652: Deadlock in AbstractCompactionManager


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user Fuud opened a pull request: https://github.com/apache/cassandra/pull/127 CASSANDRA-13652 : Deadlock in AbstractCompactionManager Change incorrect usage of LockSupport.park/unpark to Semaphore. Old implementation can deadlock because permit from unpark invocation can be consumed by park inside logging framework and manager thread will be parked forever. https://issues.apache.org/jira/browse/CASSANDRA-13652 You can merge this pull request into a Git repository by running: $ git pull https://github.com/Fuud/cassandra commitlog_deadlock Alternatively you can review and apply these changes as the patch at: https://github.com/apache/cassandra/pull/127.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #127 commit 175be27297e9933906a9261cd8d0af3a772bff24 Author: Fedor Bobin <fuudtorrentsru@gmail.com> Date: 2017-07-01T12:53:22Z CASSANDRA-13652 : Deadlock in AbstractCompactionManager
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user Fuud opened a pull request:

          https://github.com/apache/cassandra/pull/129

          CASSANDRA-13652: Deadlock in AbstractCompactionManager

          PR with same result as #127.
          Instead of small fix in #127 this PR contains refactoring to make AbstractCommitLogSegmentManager code more clear.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/Fuud/cassandra commitlog_deadlock_v2

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/cassandra/pull/129.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #129


          commit e1a695874dc24e532ae21ef627e852bf999a75f3
          Author: Fedor Bobin <fuudtorrentsru@gmail.com>
          Date: 2017-07-07T05:37:22Z

          CASSANDRA-13652: Deadlock in AbstractCompactionManager


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user Fuud opened a pull request: https://github.com/apache/cassandra/pull/129 CASSANDRA-13652 : Deadlock in AbstractCompactionManager PR with same result as #127. Instead of small fix in #127 this PR contains refactoring to make AbstractCommitLogSegmentManager code more clear. You can merge this pull request into a Git repository by running: $ git pull https://github.com/Fuud/cassandra commitlog_deadlock_v2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/cassandra/pull/129.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #129 commit e1a695874dc24e532ae21ef627e852bf999a75f3 Author: Fedor Bobin <fuudtorrentsru@gmail.com> Date: 2017-07-07T05:37:22Z CASSANDRA-13652 : Deadlock in AbstractCompactionManager
          Hide
          jjirsa Jeff Jirsa added a comment -

          Anyone familiar with this code ( git history suggests maybe Ariel Weisberg or Branimir Lambov ) interested in reviewing?

          Show
          jjirsa Jeff Jirsa added a comment - Anyone familiar with this code ( git history suggests maybe Ariel Weisberg or Branimir Lambov ) interested in reviewing?
          Hide
          aweisberg Ariel Weisberg added a comment -

          I thought that LockSupport was kind of low level when I first saw that, but I didn't want to bike shed. TIL. It was using a semaphore before, and I don't remember why we switched. I approve of the change I just want to make sure Branimir Lambov didn't have a specific reason for making the switch.

          Show
          aweisberg Ariel Weisberg added a comment - I thought that LockSupport was kind of low level when I first saw that, but I didn't want to bike shed. TIL. It was using a semaphore before, and I don't remember why we switched. I approve of the change I just want to make sure Branimir Lambov didn't have a specific reason for making the switch.
          Hide
          Fuud Fuud added a comment -

          Just to keep all things together, copy from mailing list
          http://www.mail-archive.com/dev@cassandra.apache.org/msg11313.html

          -------------------------------------------------------------

          Hello,

          I found possible deadlock in AbstractCommitLogSegmentManager. The root cause is incorrect use of LockSupport.park/unpark pair. Unpark should be invoked only if caller is sure that thread was parked in appropriate place. Otherwice permission given by calling unpark can be consumed by other structures (for example - inside ReadWriteLock).

          Jira: https://issues.apache.org/jira/browse/CASSANDRA-13652

          I suggest simplest solution: change LockSupport to Semaphore.
          PR: https://github.com/apache/cassandra/pull/127

          Also I suggest another solution with SynchronousQueue-like structure to move available segment from Manager Thread to consumers. With theese changes code became more clear and
          straightforward.

          PR https://github.com/apache/cassandra/pull/129

          We can not use j.u.c.SynchronousQueue because we need to support shutdown and there is only way to terminate SynchronousQueue.put is to call Thread.interrupt(). But C* uses nio and it does not expect ClosedByInterruptException during IO operations. Thus we can not interrupt Manager Thread.
          I implemented o.a.c.u.c.Transferer that supports shutdown and restart (needed for tests).
          https://github.com/Fuud/cassandra/blob/e1a695874dc24e532ae21ef627e852bf999a75f3/src/java/org/apache/cassandra/utils/concurrent/Transferer.java

          Also I modified o.a.c.d.c.SimpleCachedBufferPool to support waiting for free space.

          Please feel free to ask any questions.

          Thank you.

          Feodor Bobin
          fuudtorrentsru@gmail.com

          Show
          Fuud Fuud added a comment - Just to keep all things together, copy from mailing list http://www.mail-archive.com/dev@cassandra.apache.org/msg11313.html ------------------------------------------------------------- Hello, I found possible deadlock in AbstractCommitLogSegmentManager. The root cause is incorrect use of LockSupport.park/unpark pair. Unpark should be invoked only if caller is sure that thread was parked in appropriate place. Otherwice permission given by calling unpark can be consumed by other structures (for example - inside ReadWriteLock). Jira: https://issues.apache.org/jira/browse/CASSANDRA-13652 I suggest simplest solution: change LockSupport to Semaphore. PR: https://github.com/apache/cassandra/pull/127 Also I suggest another solution with SynchronousQueue-like structure to move available segment from Manager Thread to consumers. With theese changes code became more clear and straightforward. PR https://github.com/apache/cassandra/pull/129 We can not use j.u.c.SynchronousQueue because we need to support shutdown and there is only way to terminate SynchronousQueue.put is to call Thread.interrupt(). But C* uses nio and it does not expect ClosedByInterruptException during IO operations. Thus we can not interrupt Manager Thread. I implemented o.a.c.u.c.Transferer that supports shutdown and restart (needed for tests). https://github.com/Fuud/cassandra/blob/e1a695874dc24e532ae21ef627e852bf999a75f3/src/java/org/apache/cassandra/utils/concurrent/Transferer.java Also I modified o.a.c.d.c.SimpleCachedBufferPool to support waiting for free space. Please feel free to ask any questions. Thank you. Feodor Bobin fuudtorrentsru@gmail.com
          Hide
          blambov Branimir Lambov added a comment -

          The park call at line 130 is indeed suspect, as it does not check there's no action to perform before parking. I would solve the problem by dropping that call, which would make the usage of park/unpark conform to the specifications.

          Show
          blambov Branimir Lambov added a comment - The park call at line 130 is indeed suspect, as it does not check there's no action to perform before parking. I would solve the problem by dropping that call, which would make the usage of park/unpark conform to the specifications.
          Hide
          aweisberg Ariel Weisberg added a comment -

          Ah you are right it's the lack of the condition check that causes the problem. I think LockSupport park/unpark is fine it's just a thread specific semaphore bounded to a single permit.

          It's technically ok if other usages of park are woken because spurious wakeups are part of the specification so other usages should handle it.

          Show
          aweisberg Ariel Weisberg added a comment - Ah you are right it's the lack of the condition check that causes the problem. I think LockSupport park/unpark is fine it's just a thread specific semaphore bounded to a single permit. It's technically ok if other usages of park are woken because spurious wakeups are part of the specification so other usages should handle it.
          Hide
          aweisberg Ariel Weisberg added a comment -

          Although TBH thinking on it why tempt fate with LockSupport.unpark without checking the thread is actually blocked on what we think it is? Let's go with the semaphore and drop the wait at line 130.

          Show
          aweisberg Ariel Weisberg added a comment - Although TBH thinking on it why tempt fate with LockSupport.unpark without checking the thread is actually blocked on what we think it is? Let's go with the semaphore and drop the wait at line 130.
          Hide
          Fuud Fuud added a comment -

          Ariel Weisberg
          >>It's technically ok if other usages of park are woken because spurious wakeups are part of the specification so other usages should handle it.

          Yes. But such other useages will eat permit and Manager Thread will be blocked in our code.

          Show
          Fuud Fuud added a comment - Ariel Weisberg >>It's technically ok if other usages of park are woken because spurious wakeups are part of the specification so other usages should handle it. Yes. But such other useages will eat permit and Manager Thread will be blocked in our code.
          Hide
          blambov Branimir Lambov added a comment -

          Yes. But such other useages will eat permit and Manager Thread will be blocked in our code.

          This is only relevant if the eating happens before checking we need to park and actually calling LockSupport.park(). This is indeed possible for the line 130 call, which is why it should be removed and the control allowed to continue to the checked park on line 146.

          Show
          blambov Branimir Lambov added a comment - Yes. But such other useages will eat permit and Manager Thread will be blocked in our code. This is only relevant if the eating happens before checking we need to park and actually calling LockSupport.park() . This is indeed possible for the line 130 call, which is why it should be removed and the control allowed to continue to the checked park on line 146.
          Hide
          aweisberg Ariel Weisberg added a comment -

          Most usages of LockSupport don't introduce (as many) spurious unparks because they check to see of the thread is blocked on the specific condition not just any condition. I want to use a Semaphore so we don't introduce spurious unparks and trigger bugs in other usages of LockSupport.park. I don't see a huge advantage to working with LockSupport directly.

          Yes either way blocking without checking the condition at line 130 has to go.

          Show
          aweisberg Ariel Weisberg added a comment - Most usages of LockSupport don't introduce (as many) spurious unparks because they check to see of the thread is blocked on the specific condition not just any condition. I want to use a Semaphore so we don't introduce spurious unparks and trigger bugs in other usages of LockSupport.park. I don't see a huge advantage to working with LockSupport directly. Yes either way blocking without checking the condition at line 130 has to go.
          Hide
          blambov Branimir Lambov added a comment -

          The reason to prefer park for me is understandability of the code. This is a loop that does some work and pauses when there's no need to do any, a perfect candidate for park/unpark.

          Semaphore, although applicable, implies something else. Our WaitQueue is a better alternative for this kind of application.

          Show
          blambov Branimir Lambov added a comment - The reason to prefer park for me is understandability of the code. This is a loop that does some work and pauses when there's no need to do any, a perfect candidate for park/unpark. Semaphore , although applicable, implies something else. Our WaitQueue is a better alternative for this kind of application.
          Hide
          aweisberg Ariel Weisberg added a comment -

          WaitQueue seems even more obtuse? It's just a condition variable we are looking for so why not ReentrantLock and Condition or synchronized and wait/notify?

          I mean I am fine with it. People working on Cassandra need to learn how WaitQueue works at some point. It just seems like a performance optimization to avoid locking.

          Show
          aweisberg Ariel Weisberg added a comment - WaitQueue seems even more obtuse? It's just a condition variable we are looking for so why not ReentrantLock and Condition or synchronized and wait/notify? I mean I am fine with it. People working on Cassandra need to learn how WaitQueue works at some point. It just seems like a performance optimization to avoid locking.
          Hide
          aweisberg Ariel Weisberg added a comment -

          OK, how about this?

          3.11 code, utests, dtests

          Fuud are you ok with me rewriting your patch this way?

          Show
          aweisberg Ariel Weisberg added a comment - OK, how about this? 3.11 code , utests , dtests Fuud are you ok with me rewriting your patch this way?
          Hide
          Fuud Fuud added a comment -

          Yes. Seems good.

          Show
          Fuud Fuud added a comment - Yes. Seems good.
          Hide
          blambov Branimir Lambov added a comment -

          This looks good to me.

          Show
          blambov Branimir Lambov added a comment - This looks good to me.
          Hide
          aweisberg Ariel Weisberg added a comment -
          Show
          aweisberg Ariel Weisberg added a comment - Committed as eb717f154bd24453273d7175006fdef75e5724c2 . Thanks.

            People

            • Assignee:
              Unassigned
              Reporter:
              Fuud Fuud
              Reviewer:
              Ariel Weisberg
            • Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development