Jackrabbit Oak / OAK-7867

Flush thread gets stuck when the input stream of a binary blocks


    Details

      Description

      This issue tackles the root cause of the severe data loss that has been reported in OAK-7852:

      When the input stream of a binary value blocks indefinitely on read, the flush thread of the segment store gets blocked:

      "pool-2-thread-1" #15 prio=5 os_prio=31 tid=0x00007fb0f21e3000 nid=0x5f03 waiting on condition [0x000070000a46d000]
      java.lang.Thread.State: WAITING (parking)
      at sun.misc.Unsafe.park(Native Method)
      - parking to wait for  <0x000000076bba62b0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
      at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
      at com.google.common.util.concurrent.Monitor.await(Monitor.java:963)
      at com.google.common.util.concurrent.Monitor.enterWhen(Monitor.java:402)
      at org.apache.jackrabbit.oak.segment.SegmentBufferWriterPool.safeEnterWhen(SegmentBufferWriterPool.java:179)
      at org.apache.jackrabbit.oak.segment.SegmentBufferWriterPool.flush(SegmentBufferWriterPool.java:138)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter.flush(DefaultSegmentWriter.java:138)
      at org.apache.jackrabbit.oak.segment.file.FileStore.lambda$doFlush$8(FileStore.java:307)
      at org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$22/1345968304.flush(Unknown Source)
      at org.apache.jackrabbit.oak.segment.file.TarRevisions.doFlush(TarRevisions.java:237)
      at org.apache.jackrabbit.oak.segment.file.TarRevisions.flush(TarRevisions.java:195)
      at org.apache.jackrabbit.oak.segment.file.FileStore.doFlush(FileStore.java:306)
      at org.apache.jackrabbit.oak.segment.file.FileStore.flush(FileStore.java:318)
      

      The flush thread above (waiting on condition 0x000070000a46d000) is waiting for the following thread to return its SegmentBufferWriter, which will never happen if InputStream.read(...) does not progress.

      "pool-1-thread-1" #14 prio=5 os_prio=31 tid=0x00007fb0f223a800 nid=0x5d03 runnable [0x000070000a369000]
      java.lang.Thread.State: RUNNABLE
      at com.google.common.io.ByteStreams.read(ByteStreams.java:833)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.internalWriteStream(DefaultSegmentWriter.java:641)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeStream(DefaultSegmentWriter.java:618)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeBlob(DefaultSegmentWriter.java:577)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeProperty(DefaultSegmentWriter.java:691)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeProperty(DefaultSegmentWriter.java:677)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeNodeUncached(DefaultSegmentWriter.java:900)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.writeNode(DefaultSegmentWriter.java:799)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$SegmentWriteOperation.access$800(DefaultSegmentWriter.java:252)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter$8.execute(DefaultSegmentWriter.java:240)
      at org.apache.jackrabbit.oak.segment.SegmentBufferWriterPool.execute(SegmentBufferWriterPool.java:105)
      at org.apache.jackrabbit.oak.segment.DefaultSegmentWriter.writeNode(DefaultSegmentWriter.java:235)
      at org.apache.jackrabbit.oak.segment.SegmentWriter.writeNode(SegmentWriter.java:79)
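
      For illustration, here is a simplified sketch of the Guava Monitor pattern visible in the two traces. This is not the actual SegmentBufferWriterPool code; the class and member names are made up. flush() parks on a guard that is only satisfied once every borrowed writer has been returned, so a writer held by a thread stuck in InputStream.read(...) keeps flush() parked forever.

      import com.google.common.util.concurrent.Monitor;

      import java.util.HashSet;
      import java.util.Set;

      // Simplified sketch, NOT the actual SegmentBufferWriterPool implementation.
      class WriterPoolSketch {

          private final Monitor monitor = new Monitor();
          private final Set<Object> borrowed = new HashSet<>();

          // Satisfied only when no writer is checked out.
          private final Monitor.Guard allReturned = new Monitor.Guard(monitor) {
              @Override
              public boolean isSatisfied() {
                  return borrowed.isEmpty();
              }
          };

          Object borrowWriter() {
              monitor.enter();
              try {
                  Object writer = new Object();   // stands in for a SegmentBufferWriter
                  borrowed.add(writer);
                  return writer;
              } finally {
                  monitor.leave();
              }
          }

          void returnWriter(Object writer) {
              monitor.enter();
              try {
                  borrowed.remove(writer);
              } finally {
                  monitor.leave();
              }
          }

          void flush() throws InterruptedException {
              // Parks here, like the flush thread in the first trace, until every
              // borrowed writer has been returned. A writer held by a thread stuck
              // in a blocking read is never returned, so flush() waits forever.
              monitor.enterWhen(allReturned);
              try {
                  // flush segment buffers ...
              } finally {
                  monitor.leave();
              }
          }
      }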
      

       

      This issue is critical: such a misbehaving input stream causes the flush thread to get stuck, preventing transient segments from being flushed and thus causing data loss.
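
      A hedged reproduction sketch: a binary whose stream never returns from read(), for example something like the following hypothetical class (not part of Oak). Writing a blob backed by such a stream on one thread, e.g. via NodeStore.createBlob(new BlockingInputStream()), keeps that thread inside internalWriteStream(), so its SegmentBufferWriter is never returned and FileStore.flush() on another thread stays parked as in the first trace.

      import java.io.IOException;
      import java.io.InputStream;
      import java.util.concurrent.CountDownLatch;

      // Hypothetical misbehaving binary stream: read() never returns, mimicking
      // e.g. a stalled network-backed blob.
      class BlockingInputStream extends InputStream {

          private final CountDownLatch never = new CountDownLatch(1);

          @Override
          public int read() throws IOException {
              try {
                  never.await();      // blocks forever; the write operation never progresses
              } catch (InterruptedException e) {
                  throw new IOException(e);
              }
              return -1;              // never reached
          }
      }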

       

      People

      Assignee: Michael Dürig (mduerig)
      Reporter: Michael Dürig (mduerig)