  1. Jackrabbit Oak
  2. OAK-8986

Segment flush thread can remain in TIMED_WAITING state even when segment queue is empty

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.24.0, 1.26.0
    • Fix Version/s: 1.30.0
    • Component/s: segment-azure
    • Labels:
      None
    • Flags:
      Patch, Important

      Description

      If the calling thread is already in the interrupted state when SegmentWriteQueue.addToQueue executes, an InterruptedException is thrown and wrapped in an IOException.

      Right before queue.offer is called, the element is added to the segmentsByUUID map, and in this case it is never removed.
      Normally the removal happens in the thread that reads from the queue and invokes consume(SegmentWriteAction segment).

      Since the item is not removed from the segmentsByUUID map, the flusher thread remains in the TIMED_WAITING state.
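
      The leak described above can be illustrated with a minimal sketch. This is a simplified, hypothetical model and not the Oak implementation: the class name WriteQueueSketch, the method names addToQueueLeaky and addToQueueSafe, the pending() helper, the queue capacity and the one-second timeout are all assumptions made for illustration. Only segmentsByUUID, queue.offer and the wrapping of InterruptedException in IOException come from the description. The second method shows one possible way to undo the map insertion, which is not necessarily what the attached patches do.

```java
import java.io.IOException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Simplified, hypothetical model of the write queue; not the Oak class.
class WriteQueueSketch {
    private final BlockingQueue<UUID> queue = new ArrayBlockingQueue<>(10);
    private final Map<UUID, byte[]> segmentsByUUID = new ConcurrentHashMap<>();

    // Models the reported behaviour: if offer() is interrupted,
    // the entry added just before it stays in the map.
    void addToQueueLeaky(UUID id, byte[] data) throws IOException {
        segmentsByUUID.put(id, data);
        try {
            if (!queue.offer(id, 1, TimeUnit.SECONDS)) {
                segmentsByUUID.remove(id);
                throw new IOException("Can't add segment to the queue");
            }
        } catch (InterruptedException e) {
            // The map entry is not removed on this path.
            throw new IOException(e);
        }
    }

    // One possible fix (an assumption): remove the entry before
    // propagating the failure, and restore the interrupt flag.
    void addToQueueSafe(UUID id, byte[] data) throws IOException {
        segmentsByUUID.put(id, data);
        try {
            if (!queue.offer(id, 1, TimeUnit.SECONDS)) {
                segmentsByUUID.remove(id);
                throw new IOException("Can't add segment to the queue");
            }
        } catch (InterruptedException e) {
            segmentsByUUID.remove(id);
            Thread.currentThread().interrupt();
            throw new IOException(e);
        }
    }

    // Number of entries a flusher would still be waiting for.
    int pending() {
        return segmentsByUUID.size();
    }
}
```

      In this sketch, calling addToQueueLeaky from a thread whose interrupt flag is already set leaves pending() at 1 even though nothing was enqueued, while addToQueueSafe leaves it at 0.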

      The TarMK flush thread exclusively holds a monitor needed by a number of other threads, causing the repository to be blocked.

      "TarMK flush [/opt/aem/launcher/repository/segmentstore-composite-global]" #82 daemon prio=5 os_prio=0 cpu=83628.24ms elapsed=291420.48s tid=0x00007fce902f3000 nid=0x1c2b in Object.wait()  [0x00007fce00aa5000]
         java.lang.Thread.State: TIMED_WAITING (on object monitor)
      	at java.lang.Object.wait(java.base@11.0.3/Native Method)
      	- waiting on <no object reference available>
      	at org.apache.jackrabbit.oak.segment.azure.queue.SegmentWriteQueue.flush(SegmentWriteQueue.java:183)
      	- waiting to re-lock in wait() <0x00000006b4911830> (a java.util.concurrent.ConcurrentHashMap)
      	at org.apache.jackrabbit.oak.segment.azure.AzureSegmentArchiveWriter.flush(AzureSegmentArchiveWriter.java:187)
      	at org.apache.jackrabbit.oak.segment.file.tar.TarWriter.flush(TarWriter.java:186)
      	- locked <0x00000006b4911960> (a java.lang.Object)
      	at org.apache.jackrabbit.oak.segment.file.tar.TarFiles.flush(TarFiles.java:535)
      	at org.apache.jackrabbit.oak.segment.file.FileStore.lambda$tryFlush$9(FileStore.java:359)
      	at org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$232/0x000000080067ac40.flush(Unknown Source)
      	at org.apache.jackrabbit.oak.segment.file.TarRevisions.doFlush(TarRevisions.java:236)
      	at org.apache.jackrabbit.oak.segment.file.TarRevisions.tryFlush(TarRevisions.java:216)
      	at org.apache.jackrabbit.oak.segment.file.FileStore.tryFlush(FileStore.java:357)
      	at org.apache.jackrabbit.oak.segment.file.FileStore.lambda$new$5(FileStore.java:212)
      	at org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$203/0x000000080064b440.run(Unknown Source)
      	at org.apache.jackrabbit.oak.segment.file.SafeRunnable.run(SafeRunnable.java:67)
      	at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.3/Executors.java:515)
      	at java.util.concurrent.FutureTask.runAndReset(java.base@11.0.3/FutureTask.java:305)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@11.0.3/ScheduledThreadPoolExecutor.java:305)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.3/ThreadPoolExecutor.java:1128)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.3/ThreadPoolExecutor.java:628)
      	at java.lang.Thread.run(java.base@11.0.3/Thread.java:834)
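
      The thread dump shows the flush thread inside Object.wait() on the ConcurrentHashMap monitor. A minimal sketch of a wait loop that produces this state is given below. It is a hypothetical model, not the code at SegmentWriteQueue.java:183: the class name FlushWaitSketch, the 100 ms poll interval and the maxWaitMillis bound are assumptions. The bound is added only so that the sketch terminates; the method in the trace may have no such limit.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of a flush that waits for the map to drain.
class FlushWaitSketch {
    final Map<UUID, byte[]> segmentsByUUID = new ConcurrentHashMap<>();

    // Returns true if the map drained, false if maxWaitMillis elapsed first.
    boolean flush(long maxWaitMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + maxWaitMillis;
        synchronized (segmentsByUUID) {
            while (!segmentsByUUID.isEmpty()) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false;
                }
                // A timed wait on the map's monitor: this is where a thread
                // dump would report TIMED_WAITING (on object monitor).
                segmentsByUUID.wait(Math.min(100, remaining));
            }
        }
        return true;
    }
}
```

      With an entry that no consumer will ever remove, flush(300) in this sketch returns false after roughly 300 ms; without the bound, the loop would keep waiting while the caller still holds any locks it acquired earlier.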
      
      

      Here is the test case that demonstrates the problem. 

      test.patch

        Attachments

        1. test.patch
          4 kB
          Miroslav Smiljanic
        2. proposed_patch.patch
          0.8 kB
          Miroslav Smiljanic
        3. test_and_proposed_patch.patch
          4 kB
          Miroslav Smiljanic
        4. OAK-8986.patch
          4 kB
          Marcel Reutegger

            People

            • Assignee:
              mreutegg Marcel Reutegger
              Reporter:
              smiroslav Miroslav Smiljanic
            • Votes:
              0
              Watchers:
              3
