  1. Spark
  2. SPARK-32738

Thread-safe endpoints may hang due to a fatal error

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.4, 2.4.6, 3.0.0
    • Fix Version/s: 2.4.8, 3.0.2, 3.1.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Message processing for a `ThreadSafeRpcEndpoint` is controlled by `numActiveThreads` in `Inbox`. If a fatal error occurs during `Inbox.process`, `numActiveThreads` is not decremented. Other threads then cannot process messages in that inbox, which causes the endpoint to "hang".

      This problem is more serious in earlier Spark 2.x versions, where the driver, executor, and block manager endpoints are all thread-safe endpoints.
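
The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch written in Java for illustration, not Spark's actual `Inbox` (which is Scala). The class `InboxSketch`, its methods, and the `releaseOnFatalError` flag are hypothetical names; only the `numActiveThreads` counter comes from the description. It assumes a fatal error is any `java.lang.Error`, and that one plausible remedy is to decrement the counter before rethrowing.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical, simplified model of an inbox for a thread-safe endpoint. */
class InboxSketch {
    private final Queue<Runnable> messages = new ArrayDeque<>();
    // Mirrors the counter named in the description; at most 1 for a thread-safe endpoint.
    private int numActiveThreads = 0;
    // false: counter leaks on a fatal error (the reported behaviour).
    // true: counter is released before rethrowing (one possible remedy, an assumption).
    private final boolean releaseOnFatalError;

    InboxSketch(boolean releaseOnFatalError) {
        this.releaseOnFatalError = releaseOnFatalError;
    }

    synchronized void post(Runnable message) {
        messages.add(message);
    }

    synchronized int activeThreads() {
        return numActiveThreads;
    }

    /** Drains the queue; returns how many messages ran without throwing. */
    int process() {
        Runnable message;
        synchronized (this) {
            if (numActiveThreads != 0) {
                return 0; // another thread is (believed to be) processing
            }
            message = messages.poll();
            if (message == null) {
                return 0;
            }
            numActiveThreads++;
        }
        int processed = 0;
        while (true) {
            try {
                message.run();
                processed++;
            } catch (Exception nonFatal) {
                // Non-fatal failures are swallowed here so processing continues.
            } catch (Error fatal) {
                if (releaseOnFatalError) {
                    synchronized (this) {
                        numActiveThreads--;
                    }
                }
                throw fatal; // the error still propagates to the caller
            }
            synchronized (this) {
                message = messages.poll();
                if (message == null) {
                    numActiveThreads--;
                    return processed;
                }
            }
        }
    }
}
```

With `releaseOnFatalError` set to `false`, a message that throws an `Error` leaves `activeThreads()` at 1, so every later call to `process()` returns immediately and the remaining messages are never handled. With the flag set to `true`, the counter returns to 0 and a later call can drain the queue.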

          People

            Assignee: Zhenhua Wang (zhenhuawang)
            Reporter: Zhenhua Wang (zhenhuawang)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: