Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Description
NameNode calls appear to hang, and the retry logic never properly kicks in. It looks like a race condition causes a failed RPC call to never pass its request object to the new RpcConnectionImpl, so the call hangs forever. RpcConnectionImpl objects are also kept alive longer than they should be because of a shared_ptr cycle between them and the timeout tracking objects.
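
To illustrate the shared_ptr cycle described above, here is a minimal sketch. It is not the actual RpcConnectionImpl code: Connection, TimeoutTracker, ArmWithStrongCapture, ArmWithWeakCapture and LeaksAfterRelease are hypothetical names made up for this example. It assumes the timeout object stores a callback that captures the connection. If that capture is a shared_ptr, the connection and the tracker keep each other alive; capturing a weak_ptr instead is one common way to break such a cycle, though the real fix may differ.

```cpp
#include <functional>
#include <memory>

// Hypothetical stand-in for a timeout tracking object: it only stores a callback.
struct TimeoutTracker {
  std::function<void()> on_timeout;
};

// Hypothetical stand-in for a connection; NOT the real RpcConnectionImpl.
struct Connection : std::enable_shared_from_this<Connection> {
  static int live;  // number of Connection objects currently alive
  std::shared_ptr<TimeoutTracker> tracker = std::make_shared<TimeoutTracker>();
  bool timed_out = false;

  Connection() { ++live; }
  ~Connection() { --live; }

  // Leaky: the callback owns the connection, and the connection owns the
  // tracker that stores the callback, so neither refcount can reach zero.
  void ArmWithStrongCapture() {
    auto self = shared_from_this();
    tracker->on_timeout = [self]() { self->timed_out = true; };
  }

  // Safe: the callback holds a weak_ptr and locks it only while it runs.
  void ArmWithWeakCapture() {
    std::weak_ptr<Connection> weak = shared_from_this();
    tracker->on_timeout = [weak]() {
      if (auto self = weak.lock()) self->timed_out = true;
    };
  }
};

int Connection::live = 0;

// Returns true if the connection outlives its last external reference.
bool LeaksAfterRelease(bool use_weak) {
  std::shared_ptr<TimeoutTracker> tracker;
  const int before = Connection::live;
  {
    auto conn = std::make_shared<Connection>();
    tracker = conn->tracker;  // keep a handle so the cycle can be broken later
    if (use_weak) {
      conn->ArmWithWeakCapture();
    } else {
      conn->ArmWithStrongCapture();
    }
  }  // the last external reference to the connection is dropped here
  const bool still_alive = Connection::live > before;
  tracker->on_timeout = nullptr;  // break the cycle so the demo itself does not leak
  return still_alive;
}
```

In this sketch, LeaksAfterRelease(false) reports that the connection is still alive after its owner lets go of it, while LeaksAfterRelease(true) reports that it was destroyed promptly.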