Move all interrupt() calls to TaskCanceler

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.4.0, 1.4.1, 1.4.2
    • Fix Version/s: 1.4.3, 1.5.0
    • Component/s: Runtime / Coordination
    • Labels: None

    Description

      We need this to work around the following JVM bug: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8138622

      To circumvent this problem, the TaskCancelerWatchDog must not call interrupt() at all. It should only join on the executing thread (with a timeout) and cause a hard exit once cancellation takes too long.
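
      The join-then-hard-exit behavior described above can be sketched roughly as follows. This is a minimal illustration, not Flink's actual TaskCancelerWatchDog: the class name CancellationWatchdog, its constructor parameters, and the injectable fatalAction are all assumptions made for the example.

```java
/**
 * Hypothetical watchdog (not Flink's actual class): waits for the executing
 * thread with a timeout and triggers a fatal action if it is still alive.
 * It never calls interrupt() on the watched thread.
 */
final class CancellationWatchdog implements Runnable {

    private final Thread executingThread;
    private final long timeoutMillis;
    private final Runnable fatalAction;

    CancellationWatchdog(Thread executingThread, long timeoutMillis, Runnable fatalAction) {
        if (timeoutMillis <= 0) {
            // Thread.join(0) would wait forever, which defeats the watchdog.
            throw new IllegalArgumentException("timeoutMillis must be positive");
        }
        this.executingThread = executingThread;
        this.timeoutMillis = timeoutMillis;
        this.fatalAction = fatalAction;
    }

    @Override
    public void run() {
        try {
            // Only wait; interrupting is left to a separate canceler.
            executingThread.join(timeoutMillis);
        } catch (InterruptedException e) {
            // The watchdog itself was interrupted: restore the flag and give up
            // without escalating, since the timeout has not necessarily elapsed.
            Thread.currentThread().interrupt();
            return;
        }
        if (executingThread.isAlive()) {
            // Cancellation took too long: escalate.
            fatalAction.run();
        }
    }
}
```

      In a real process the fatal action could be something like `() -> Runtime.getRuntime().halt(-1)`. Passing it in as a Runnable is a design choice of this sketch that keeps the escalation path testable.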

      A user affected by this problem reported it in FLINK-8834.

      Personal note: The Thread.join(...) method is unfortunately not 100% reliable either, because it uses System.currentTimeMillis() rather than System.nanoTime(). Because of that, the wait can take far longer than requested when the system clock is adjusted. I wonder why the JDK authors do not follow their own recommendations and use System.nanoTime() for all relative time measures...

      EDIT: I am not the only one wondering why: https://stackoverflow.com/questions/42544387/why-does-thread-join-use-currenttimemillis
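
      To illustrate the concern in the personal note above, here is one possible way to measure a join timeout against System.nanoTime() instead of the wall clock. It is a sketch only: the class MonotonicJoin, its join method, and the 10 ms polling slice are hypothetical names and values chosen for the example, not part of the JDK or of Flink. It polls isAlive() in short sleeps, and it makes no claim about how individual sleeps react to clock adjustments on a given JVM or platform.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: a timed join whose deadline is tracked with System.nanoTime(). */
final class MonotonicJoin {

    // Assumed polling granularity; smaller values react faster but wake up more often.
    private static final long POLL_SLICE_MILLIS = 10L;

    private MonotonicJoin() {}

    /**
     * Waits until the thread terminates or the timeout elapses.
     *
     * @return true if the thread terminated, false if the deadline passed first
     */
    static boolean join(Thread thread, long timeoutMillis) throws InterruptedException {
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (thread.isAlive()) {
            // Subtracting nanoTime values is the recommended overflow-safe comparison.
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            long remainingMillis = Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            Thread.sleep(Math.min(POLL_SLICE_MILLIS, remainingMillis));
        }
        return true;
    }
}
```

      Compared with Thread.join(millis), this trades wake-up latency (up to one polling slice) for an overall deadline that does not depend on System.currentTimeMillis().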

            People

              Assignee: Stephan Ewen (sewen)
              Reporter: Stephan Ewen (sewen)
