SPARK-39283

Spark tasks stuck forever due to deadlock between TaskMemoryManager and UnsafeExternalSorter

Details

    Description

      We are seeing this deadlock between TaskMemoryManager and UnsafeExternalSorter quite often in our workload. Sometimes the retry succeeds, but sometimes we have to resort to hacky ways of breaking the deadlock, such as explicitly shutting down the worker machines.

      The thread dump from the Spark UI showing the deadlock is attached (DeadlockSparkTasks.png).

       

      I believe there was a related Jira issue about a similar deadlock between the same threads, and it was resolved:
      https://issues.apache.org/jira/browse/SPARK-27338

       

       

      Attachments

        1. DeadlockSparkTasks.png
          243 kB
          Sandeep Pal

          People

            sandeep.pal Sandeep Pal