Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
2.3.4, 2.4.6, 3.0.0
None
Description
Message processing for a `ThreadSafeRpcEndpoint` is controlled by `numActiveThreads` in `Inbox`. Currently, if a fatal error occurs during `Inbox.process`, `numActiveThreads` is not decremented. Other threads then cannot process messages in that inbox, which causes the endpoint to "hang".
This problem is more serious in earlier Spark 2.x versions, since the driver, executor, and block manager endpoints are all thread-safe endpoints.
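
To make the failure mode concrete, here is a minimal sketch of the mechanism described above. It is not Spark's actual `Inbox` code: `SketchInbox`, its `releaseOnFatal` flag, and the helper methods are hypothetical names introduced only for illustration, and the single-active-thread rule is modeled in the simplest possible way. With `releaseOnFatal = false` the counter stays at 1 after a fatal error, so every later call to `process` is turned away; with `releaseOnFatal = true` the counter is restored and the remaining messages can still be handled.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for an inbox that admits one thread at a time.
// This is an illustration of the bug, not Spark's implementation.
class SketchInbox(handler: String => Unit, releaseOnFatal: Boolean) {
  private val messages = mutable.Queue.empty[String]
  private var numActiveThreads = 0

  def post(msg: String): Unit = synchronized { messages.enqueue(msg) }

  def activeThreads: Int = synchronized { numActiveThreads }

  def pending: Int = synchronized { messages.size }

  private def poll(): Option[String] = synchronized {
    if (messages.nonEmpty) Some(messages.dequeue()) else None
  }

  /** Drains the queue. Returns false if another thread is still counted as active. */
  def process(): Boolean = {
    // Admit the caller only when no other thread is counted as active.
    val admitted = synchronized {
      if (numActiveThreads == 0) { numActiveThreads += 1; true } else false
    }
    if (admitted) {
      var finished = false
      try {
        var next = poll()
        while (next.isDefined) {
          handler(next.get)
          next = poll()
        }
        finished = true
      } finally {
        // Normal completion always releases the slot. After a throwable, the slot
        // is released only when releaseOnFatal is set; otherwise it leaks.
        if (finished || releaseOnFatal) synchronized { numActiveThreads -= 1 }
      }
    }
    admitted
  }
}
```

In this sketch, a handler that throws a fatal error such as `LinkageError` leaves `activeThreads` at 1 when `releaseOnFatal` is false, and any message posted afterwards stays in the queue. Decrementing the counter on every exit path, including fatal errors, is one plausible way to avoid the hang; the throwable itself still propagates to the caller.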