  1. Jackrabbit Oak
  2. OAK-7867

Flush thread gets stuck when the input stream of a binary blocks

Details

    Description

      This issue tackles the root cause of the severe data loss that has been reported in OAK-7852:

      When the input stream of a binary value blocks indefinitely on read, the flush thread of the segment store gets blocked:

      "pool-2-thread-1" #15 prio=5 os_prio=31 tid=0x00007fb0f21e3000 nid=0x5f03 waiting on condition [0x000070000a46d000]
      java.lang.Thread.State: WAITING (parking)
      at sun.misc.Unsafe.park(Native Method)
      - parking to wait for  <0x000000076bba62b0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
      at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
      at com.google.common.util.concurrent.Monitor.await(Monitor.java:963)
      at com.google.common.util.concurrent.Monitor.enterWhen(Monitor.java:402)
      at org.apache.jackrabbit.oak.segment.SegmentBufferWriterPool.safeEnterWhen(SegmentBufferWriterPool.java:179)
      at org.apache.jackrabbit.oak.segment.SegmentBufferWriterPool.flush(SegmentBufferWriterPool.java:138)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter.flush(DefaultSegmentWriter.java:138)
      at org.apache.jackrabbit.oak.segment.file.FileStore.lambda$doFlush$8(FileStore.java:307)
      at org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$22/1345968304.flush(Unknown Source)
      at org.apache.jackrabbit.oak.segment.file.TarRevisions.doFlush(TarRevisions.java:237)
      at org.apache.jackrabbit.oak.segment.file.TarRevisions.flush(TarRevisions.java:195)
      at org.apache.jackrabbit.oak.segment.file.FileStore.doFlush(FileStore.java:306)
      at org.apache.jackrabbit.oak.segment.file.FileStore.flush(FileStore.java:318)
      

      The flush thread above (waiting on condition [0x000070000a46d000]) is waiting for the following thread to return its SegmentBufferWriter, which will never happen if InputStream.read(...) does not make progress.

      "pool-1-thread-1" #14 prio=5 os_prio=31 tid=0x00007fb0f223a800 nid=0x5d03 runnable [0x000070000a369000]
      java.lang.Thread.State: RUNNABLE
      at com.google.common.io.ByteStreams.read(ByteStreams.java:833)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.internalWriteStream(DefaultSegmentWriter.java:641)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeStream(DefaultSegmentWriter.java:618)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeBlob(DefaultSegmentWriter.java:577)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeProperty(DefaultSegmentWriter.java:691)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeProperty(DefaultSegmentWriter.java:677)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeNodeUncached(DefaultSegmentWriter.java:900)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeNode(DefaultSegmentWriter.java:799)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.access$800(DefaultSegmentWriter.java:252)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$8.execute(DefaultSegmentWriter.java:240)
      at org.apache.jackrabbit.oak.segment.SegmentBufferWriterPool.execute(SegmentBufferWriterPool.java:105)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter.writeNode(DefaultSegmentWriter.java:235)
      at org.apache.jackrabbit.oak.segment.SegmentWriter.writeNode(SegmentWriter.java:79)
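
      The interaction between the two threads above can be modelled with a small, self-contained sketch. It is a toy model and not Oak code: ToyWriterPool, BlockingInputStream and flushCompletes are hypothetical names, and the single-writer pool only stands in for the pool seen in the traces. The timeout on flush exists only so that the demo terminates; the wait in the trace above has no timeout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class BlockedFlushDemo {

    /** Hypothetical stream whose read() does not return until release() is called. */
    static final class BlockingInputStream extends InputStream {
        private final CountDownLatch released = new CountDownLatch(1);

        @Override
        public int read() throws IOException {
            try {
                released.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while reading", e);
            }
            return -1; // end of stream once released
        }

        void release() {
            released.countDown();
        }
    }

    /** Toy pool with a single writer: flush has to wait until that writer is returned. */
    static final class ToyWriterPool {
        private final CountDownLatch borrowed = new CountDownLatch(1);
        private final CountDownLatch returned = new CountDownLatch(1);

        /** Borrows the writer, reads from the stream, and returns the writer afterwards. */
        void execute(InputStream in) throws IOException {
            borrowed.countDown();
            try {
                in.read();
            } finally {
                returned.countDown();
            }
        }

        void awaitBorrowed() throws InterruptedException {
            borrowed.await();
        }

        /** Returns true if the writer came back within the timeout (assumption: demo only). */
        boolean flush(long timeoutMillis) throws InterruptedException {
            return returned.await(timeoutMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Runs one write and one flush; reports whether the flush could complete. */
    public static boolean flushCompletes(boolean streamBlocks, long timeoutMillis) throws Exception {
        BlockingInputStream in = new BlockingInputStream();
        ToyWriterPool pool = new ToyWriterPool();
        if (!streamBlocks) {
            in.release(); // a well-behaved stream returns from read() right away
        }
        Thread writer = new Thread(() -> {
            try {
                pool.execute(in);
            } catch (IOException e) {
                // not relevant for the demo
            }
        }, "writer");
        writer.setDaemon(true);
        writer.start();
        try {
            pool.awaitBorrowed();
            return pool.flush(timeoutMillis);
        } finally {
            in.release(); // unblock the writer so the demo can exit
            writer.join();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("healthy stream, flush completed: " + flushCompletes(false, 5000));
        System.out.println("blocking stream, flush completed: " + flushCompletes(true, 200));
    }
}
```

      In this model the flush only completes once the writer thread has returned from read(), which mirrors the dependency visible in the two stack traces.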
      

       

      This issue is critical: such a misbehaving input stream causes the flush thread to get stuck, which prevents transient segments from being flushed and thus causes data loss.
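
      One generic way to keep a stalled stream from holding a shared resource indefinitely is to bound each read with a timeout. The sketch below illustrates that idea with plain java.util.concurrent classes. It is an assumption for illustration, not necessarily how this issue was resolved in Oak; BoundedRead and readWithTimeout are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedRead {

    /**
     * Reads into buf on a helper thread and gives up after timeoutMillis.
     * Hypothetical helper: the caller decides what to do on TimeoutException.
     */
    public static int readWithTimeout(InputStream in, byte[] buf, long timeoutMillis,
                                      ExecutorService executor)
            throws IOException, TimeoutException, InterruptedException {
        Callable<Integer> task = () -> in.read(buf, 0, buf.length);
        Future<Integer> pending = executor.submit(task);
        try {
            return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IOException(cause);
        } catch (TimeoutException e) {
            pending.cancel(true); // interrupts the helper thread; only helps if read() is interruptible
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        // Daemon threads so that a read that never returns cannot keep the JVM alive.
        ExecutorService executor = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r, "stream-reader");
            t.setDaemon(true);
            return t;
        });
        try {
            byte[] buf = new byte[16];
            int n = readWithTimeout(new ByteArrayInputStream(new byte[] {1, 2, 3}), buf, 1000, executor);
            System.out.println("healthy stream: read " + n + " bytes");

            // A connected pipe that nobody writes to: read() blocks.
            PipedOutputStream neverWritten = new PipedOutputStream();
            InputStream stalled = new PipedInputStream(neverWritten);
            try {
                readWithTimeout(stalled, buf, 200, executor);
                System.out.println("stalled stream: read returned unexpectedly");
            } catch (TimeoutException e) {
                System.out.println("stalled stream: gave up after timeout");
            }
        } finally {
            executor.shutdownNow();
        }
    }
}
```

      The timeout only bounds how long the caller waits. A read() that ignores interruption still occupies its helper thread, so this approach protects the caller (here, whatever holds the shared writer) rather than fixing the stream itself.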

       

            People

              mduerig Michael Dürig