SPARK-34523 (sub-task of SPARK-34522: Issue Tracker for JDK related Bugs)

JDK-8194653: Deadlock involving FileSystems.getDefault and System.loadLibrary call

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Information Provided
    • Affects Version/s: 2.4.7, 3.0.2, 3.1.1
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None
    • External issue ID: JDK-8194653

      Description

      Instruction

      This JDK bug can cause a deadlock that hangs concurrent tasks on the same executor forever. For example:

      In the Spark UI stage tab, you may find that some tasks hang for hours while all the others complete without delay.

      Also, you may find that these hanging tasks belong to the same executors.
      Usually, in this case, you will also get nothing helpful from the executor log.

      If you print the executor jstack, or check the thread dump via the Spark UI executor tab, and find task threads blocked in stacks involving `FileSystems.getDefault` or `System.loadLibrary`, you are very likely hitting the JDK-8194653 issue.

      Solutions

      Here are some options to circumvent this problem:

      1. On the cluster manager side, you can upgrade the JDK to a version that includes the fix, according to https://bugs.openjdk.java.net/browse/JDK-8194653
      2. If you are not able to upgrade the JDK for the entire cluster, you can use `spark.executorEnv.JAVA_HOME` to specify a suitable JRE for your apps
      3. Also, turning on `spark.speculation` may let Spark automatically re-run the hanging tasks and bypass the problem
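The second and third options are plain configuration settings, so they can be passed at submit time. The following is a minimal sketch that assembles the corresponding `--conf` arguments for `spark-submit`. The helper name `build_workaround_conf` and the JRE path `/opt/jdk-fixed` are hypothetical; the path must point to a JRE that actually exists on the worker nodes and contains the fix.

```python
import shlex


def build_workaround_conf(java_home=None, speculation=False):
    """Return spark-submit arguments for the workarounds listed above."""
    conf = {}
    if java_home:
        # Run executors on a JRE that is not affected by JDK-8194653.
        conf["spark.executorEnv.JAVA_HOME"] = java_home
    if speculation:
        # Let Spark re-launch slow tasks; this may bypass hung tasks
        # but does not remove the underlying deadlock.
        conf["spark.speculation"] = "true"
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Hypothetical JRE location; replace with a real path on your worker nodes.
args = build_workaround_conf(java_home="/opt/jdk-fixed", speculation=True)
print(" ".join(shlex.quote(a) for a in args))
```

The printed arguments can be appended to an existing `spark-submit` command line. Speculation only works around the symptom, so pairing it with a fixed JRE is the safer choice where possible.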

        Attachments

        1. screenshot-2.png
          132 kB
          Kent Yao
        2. screenshot-1.png
          167 kB
          Kent Yao
        3. 4303.log
          39 kB
          Kent Yao

              People

              • Assignee: Unassigned
              • Reporter: Kent Yao