  1. Ignite
  2. IGNITE-6171

Native facility to control excessive GC pauses

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3
    • Fix Version/s: 2.5
    • Component/s: general
    • Labels:

      Description

      Ignite is a Java-based application. If a node experiences long GC pauses, it may negatively affect other nodes. We need a way to detect long GC pauses from within the process and trigger some action in response, e.g. stopping the node.

      This is a kind of Inception [1]: you need to realize that you are asleep while you are sleeping. Because all Java threads are blocked at a safepoint during the pause, a Java thread cannot be used to detect the JVM's own GC pause. Native threads should be used instead.

      Proposed solution:
      1) Thread 1 should periodically call a dummy JNI method that returns the current time, and store that time in a shared variable;
      2) Thread 2 should periodically check that variable. If it has not changed for some time, we are most likely in a GC pause. Once a certain threshold is reached, trigger a compensating action, whether that is a warning, a process kill, or something else.

      Justification: crossing the native -> Java boundary involves a safepoint check, so Thread 1 will be trapped while an STW (stop-the-world) pause is in progress. The Java method cannot be empty, because the JVM is smart enough to reduce an empty method to a no-op.
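
      The two-thread scheme above could be sketched roughly as follows. This is a minimal C sketch under stated assumptions, not the code committed for this issue: probe_jvm_time is a hypothetical stand-in for the JNI upcall (here it only reads the monotonic clock, so it never actually blocks), and every function name, the 100 ms heartbeat period, and the polling interval are illustrative choices.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Shared variable: monotonic time (ms) of the last completed heartbeat. */
static _Atomic int64_t last_beat_ms;

/* Monotonic clock in milliseconds; unaffected by wall-clock adjustments. */
static int64_t now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

static void sleep_ms(long ms) {
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* HYPOTHETICAL stand-in for the JNI upcall. In a real setup this would call
 * a non-empty Java method from a native thread attached to the JVM, and the
 * call would block while a stop-the-world pause is in progress. */
static int64_t probe_jvm_time(void) {
    return now_ms();
}

/* Thread 1 step: cross into the JVM, then publish the time it returned. */
void heartbeat_step(void) {
    atomic_store(&last_beat_ms, probe_jvm_time());
}

/* Pure check: has the heartbeat been silent for longer than the threshold? */
bool is_stalled(int64_t last_ms, int64_t current_ms, int64_t threshold_ms) {
    return current_ms - last_ms > threshold_ms;
}

/* Thread 2 step: compare the shared variable against the current time. */
bool watchdog_step(int64_t threshold_ms) {
    assert(threshold_ms > 0);
    return is_stalled(atomic_load(&last_beat_ms), now_ms(), threshold_ms);
}

/* Thread 1 body: heartbeat every 100 ms (assumed period). */
void *heartbeat_loop(void *arg) {
    (void)arg;
    for (;;) {
        heartbeat_step();
        sleep_ms(100);
    }
}

/* Thread 2 body: never enters the JVM, so a safepoint cannot trap it. */
void *watchdog_loop(void *arg) {
    int64_t threshold_ms = (int64_t)(intptr_t)arg;
    for (;;) {
        if (watchdog_step(threshold_ms)) {
            /* Compensating action: a warning here; could be a process kill. */
            fprintf(stderr, "possible STW pause: no heartbeat for over %lld ms\n",
                    (long long)threshold_ms);
        }
        sleep_ms(50);
    }
}

/* Starts both native threads; returns 0 on success, non-zero on failure. */
int gc_watchdog_start(int64_t threshold_ms) {
    pthread_t hb, wd;
    atomic_store(&last_beat_ms, now_ms()); /* avoid a false alarm at startup */
    if (pthread_create(&hb, NULL, heartbeat_loop, NULL) != 0) return -1;
    if (pthread_create(&wd, NULL, watchdog_loop,
                       (void *)(intptr_t)threshold_ms) != 0) return -1;
    return 0;
}
```

      A host process would call gc_watchdog_start once, for example with a 500 ms threshold, after the JVM is up. Keeping the watchdog thread entirely outside the JVM is the point of the design: only the heartbeat thread is allowed to be trapped by a safepoint.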

      [1] http://www.imdb.com/title/tt1375666/

              People

              • Assignee:
                cyberdemon Dmitriy Sorokin
                Reporter:
                vozerov Vladimir Ozerov
              • Votes:
                1
                Watchers:
                7

                Dates

                • Created:
                  Updated:
                  Resolved: