FLINK-5228

LocalInputChannel re-trigger request and release deadlock

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.4, 1.2.0
    • Component/s: Runtime / Network
    • Labels: None

    Description

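      Concurrent release and re-triggering of a partition request can lead to a deadlock. In the thread dump below, the task canceler holds the lock taken in SingleInputGate.releaseAllResources and waits for the lock in LocalInputChannel.releaseAllResources, while the timer thread holds the lock taken in LocalInputChannel.requestSubpartition and waits for the lock in SingleInputGate.retriggerPartitionRequest. The two threads acquire the same two locks in opposite order.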

      Found one Java-level deadlock:
      =============================
      "Canceler for Map -> Sink: Unnamed (1/4)":
      waiting to lock monitor 0x0000000001e27bd8 (object 0x00000000ffa1f688, a java.lang.Object),
      which is held by "Timer-3"
      "Timer-3":
      waiting to lock monitor 0x00007fdbd029ec48 (object 0x00000000ffa1f3a0, a java.lang.Object),
      which is held by "Canceler for Map -> Sink: Unnamed (1/4)"
      
      Java stack information for the threads listed above:
      ===================================================
      "Canceler for Map -> Sink: Unnamed (1/4)":
         at org.apache.flink.runtime.io.network.partition.consumer.LocalInputChannel.releaseAllResources(LocalInputChannel.java:240)
         - waiting to lock <0x00000000ffa1f688> (a java.lang.Object)
         at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.releaseAllResources(SingleInputGate.java:348)
         - locked <0x00000000ffa1f3a0> (a java.lang.Object)
         at org.apache.flink.runtime.taskmanager.Task$TaskCanceler.run(Task.java:1280)
         at java.lang.Thread.run(Thread.java:745)
      "Timer-3":
         at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.retriggerPartitionRequest(SingleInputGate.java:307)
         - waiting to lock <0x00000000ffa1f3a0> (a java.lang.Object)
         at org.apache.flink.runtime.io.network.partition.consumer.LocalInputChannel.requestSubpartition(LocalInputChannel.java:128)
         - locked <0x00000000ffa1f688> (a java.lang.Object)
         at org.apache.flink.runtime.io.network.partition.consumer.LocalInputChannel$1.run(LocalInputChannel.java:148)
         at java.util.TimerThread.mainLoop(Timer.java:555)
         at java.util.TimerThread.run(Timer.java:505)
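
The lock cycle in the dump above can be illustrated with a small sketch. This is not the Flink code and not necessarily the fix that was applied. Gate and Channel are hypothetical stand-ins for SingleInputGate and LocalInputChannel, and the sketch assumes the partition is never found, so a live channel always asks for a re-trigger. It shows one common way to break such a cycle: decide under the channel lock whether a re-trigger is needed, then call into the gate only after that lock has been released.

```java
// Hypothetical sketch: Gate and Channel are simplified stand-ins, not Flink classes.
final class LockOrderSketch {

    static final class Gate {
        private final Object gateLock = new Object();
        private Channel channel;
        private boolean released;
        private int retriggers;

        void setChannel(Channel channel) {
            synchronized (gateLock) {
                this.channel = channel;
            }
        }

        // Release path: takes the gate lock first, then the channel lock.
        void releaseAllResources() {
            synchronized (gateLock) {
                released = true;
                channel.releaseAllResources();
            }
        }

        // Re-trigger path: needs the gate lock only.
        void retriggerPartitionRequest() {
            synchronized (gateLock) {
                if (!released) {
                    retriggers++;
                }
            }
        }
    }

    static final class Channel {
        private final Object channelLock = new Object();
        private final Gate gate;
        private boolean released;

        Channel(Gate gate) {
            this.gate = gate;
        }

        void requestSubpartition() {
            boolean retrigger;
            synchronized (channelLock) {
                // Assumption for the sketch: the partition is never found,
                // so a live channel always asks for a re-trigger.
                retrigger = !released;
            }
            // The channel lock is no longer held here, so this thread never
            // waits for the gate lock while holding the channel lock.
            if (retrigger) {
                gate.retriggerPartitionRequest();
            }
        }

        void releaseAllResources() {
            synchronized (channelLock) {
                released = true;
            }
        }
    }

    // Races release against re-trigger; returns false if a thread fails to finish in time.
    static boolean runConcurrently(int rounds) {
        for (int i = 0; i < rounds; i++) {
            Gate gate = new Gate();
            Channel channel = new Channel(gate);
            gate.setChannel(channel);

            Thread canceler = new Thread(gate::releaseAllResources, "canceler");
            Thread timer = new Thread(channel::requestSubpartition, "timer");
            canceler.setDaemon(true);
            timer.setDaemon(true);
            canceler.start();
            timer.start();
            try {
                canceler.join(2000);
                timer.join(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            if (canceler.isAlive() || timer.isAlive()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: " + runConcurrently(1000));
    }
}
```

If the call to gate.retriggerPartitionRequest() were moved inside the synchronized (channelLock) block, the sketch would reproduce the lock ordering seen in the dump: the timer thread would wait for the gate lock while holding the channel lock, and the canceler would wait for the channel lock while holding the gate lock.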
      

          People

            Assignee: Ufuk Celebi (uce)
            Reporter: Ufuk Celebi (uce)