Flink / FLINK-8717

Flink seems to deadlock due to buffer starvation when iterating

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Bug
    • Affects Version/s: 1.4.0
    • Fix Version/s: None
    • Component/s: API / DataStream
    • Labels: None
    • Environment:
      Windows 10 Pro 64-bit
      Core i7-6820HQ @ 2.7 GHz
      16GB RAM
      Flink 1.4
      Scala client
      Scala 2.11.7

    Description

      We are encountering what looks like a Flink deadlock in one of our jobs that uses "iterate".

      I've reduced the job to the example in this gist: https://gist.github.com/rrevol/06ddfecd5f5ac7cbc67785b5d3a84dd4

      Note that:

      • varying the parallelism affects how quickly the deadlock occurs, but it always occurs
      • varying MAX_LOOP_NB does affect the deadlock: the higher it is, the sooner the deadlock occurs. With MAX_LOOP_NB == 1 there is no deadlock. This suggests that it happens once the number of iterations reaches some threshold.

      From the threadDump.txt, it looks like starvation over buffer allocation. Backpressure may be flawed with iterate, but I may be mistaken since I don't know Flink internals well.
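
      To make the buffer-starvation hypothesis concrete, here is a toy model of a loop with two bounded buffers. It is only a sketch under stated assumptions: it is not Flink's network stack and not the job from the gist, and every name in it (run_loop, capacity, feedback_passes) as well as the scheduling order is hypothetical. It shows that a cycle of fixed-size buffers can stall once a source keeps filling the slots that the feedback edge needs; it does not attempt to reproduce the observed dependence on MAX_LOOP_NB or parallelism.

```python
from collections import deque


def run_loop(capacity, source_records, feedback_passes, max_ticks=10_000):
    """Toy model of a streaming loop with two bounded buffers.

    Returns (status, emitted), where status is "finished", "deadlock",
    or "unknown" if max_ticks is exhausted.
    All names and the scheduling order are illustrative assumptions.
    """
    input_buf = deque()     # bounded: records waiting for the loop operator
    feedback_buf = deque()  # bounded: records travelling back to the loop head
    remaining = source_records
    emitted = 0

    for _ in range(max_ticks):
        progressed = False

        # 1. Operator: look at the head record, then emit it or feed it back.
        if input_buf:
            passes_left = input_buf[0]
            if passes_left == 0:
                input_buf.popleft()
                emitted += 1
                progressed = True
            elif len(feedback_buf) < capacity:
                input_buf.popleft()
                feedback_buf.append(passes_left - 1)
                progressed = True
            # Otherwise the operator is blocked: the feedback buffer is full.

        # 2. Source: greedily fills every free input slot with new records.
        while remaining > 0 and len(input_buf) < capacity:
            input_buf.append(feedback_passes)
            remaining -= 1
            progressed = True

        # 3. Feedback edge: needs a free input slot to return one record.
        if feedback_buf and len(input_buf) < capacity:
            input_buf.append(feedback_buf.popleft())
            progressed = True

        if not input_buf and not feedback_buf and remaining == 0:
            return "finished", emitted
        if not progressed:
            return "deadlock", emitted

    return "unknown", emitted
```

      In this model, run_loop(4, 100, 0) finishes because no record is ever fed back, and run_loop(4, 4, 3) finishes because all records fit in the loop at once. run_loop(4, 100, 3) stalls: both buffers fill up, the operator waits for feedback space, and the feedback edge waits for input space.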

      Attachments

        1. threadDump.txt (94 kB, uploaded by Romain Revol)

            People

              Assignee: Unassigned
              Reporter: Romain Revol (rrevol)
              Votes: 2
              Watchers: 4
