CASSANDRA-15295

Deadlock during CommitLog initialization


Details

    Description

      Recently, I found a Cassandra (3.11.4) node stuck in STARTING status for a long time.
      I used jstack to see what was happening. The main thread was stuck in AbstractCommitLogSegmentManager.awaitAvailableSegment.

      The strange thing is that the COMMIT-LOG-ALLOCATOR thread's state was RUNNABLE, but it was not actually running.

      I then used pstack to troubleshoot and found that COMMIT-LOG-ALLOCATOR was blocked on Java class initialization.

      This is clearly a deadlock. While the CommitLog class is initializing, the main thread waits for a CommitLogSegment to become available. At this point the CommitLog class is not yet initialized and the main thread holds the class initialization lock. Meanwhile, COMMIT-LOG-ALLOCATOR hits an exception while creating a CommitLogSegment and calls the static method CommitLog.handleCommitError. COMMIT-LOG-ALLOCATOR blocks on that call because the CommitLog class is still initializing, so neither thread can make progress.
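
The pattern described above can be reproduced in isolation. The following is a minimal sketch, not Cassandra code: `Log`, `handleError`, and the `ALLOCATOR` thread are hypothetical stand-ins for CommitLog, CommitLog.handleCommitError, and COMMIT-LOG-ALLOCATOR. Unlike the real case, the static initializer here waits with a timeout, so the demo terminates instead of hanging forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ClassInitDeadlockDemo {

    // Hypothetical stand-in for CommitLog: its static initializer
    // waits for work done by a helper thread.
    static class Log {
        static final boolean helperFinishedDuringInit;

        static {
            final CountDownLatch done = new CountDownLatch(1);

            // Hypothetical stand-in for COMMIT-LOG-ALLOCATOR.
            Thread allocator = new Thread(new Runnable() {
                @Override
                public void run() {
                    // Invoking a static method requires Log to be initialized.
                    // Another thread is still running Log's static initializer,
                    // so this call blocks until that initializer completes.
                    Log.handleError();
                    done.countDown();
                }
            }, "ALLOCATOR");
            allocator.setDaemon(true);
            allocator.start();

            boolean finished;
            try {
                // The real code waits without a timeout and therefore never returns.
                // The timeout here exists only so that the demo can finish.
                finished = done.await(500, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                finished = false;
            }
            helperFinishedDuringInit = finished;
        }

        // Hypothetical stand-in for the static CommitLog.handleCommitError.
        static void handleError() {
        }
    }

    public static void main(String[] args) {
        // Reading the field triggers Log's initialization on the main thread.
        System.out.println("helper finished during init: " + Log.helperFinishedDuringInit);
    }
}
```

Running this prints `helper finished during init: false`: the helper thread cannot enter the static method while the initializing thread is still inside the static initializer. Remove the timeout and the two threads wait on each other indefinitely, which matches the hang reported here.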


      Attachments

        1. jstack.log
          15 kB
          Zephyr Guo
        2. pstack.log
          34 kB
          Zephyr Guo
        3. screenshot-1.png
          865 kB
          Zephyr Guo
        4. screenshot-2.png
          224 kB
          Zephyr Guo
        5. screenshot-3.png
          569 kB
          Zephyr Guo
        6. image.png
          583 kB
          David Capwell

            People

              gzh1992n Zephyr Guo
              gzh1992n Zephyr Guo
              Dinesh Joshi, Zephyr Guo
              Dinesh Joshi, Jordan West
              Votes: 0
              Watchers: 9

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10m