Here's a deadlock scenario that cropped up during pipeline recovery, debugged through jstacks. Todd tipped me off to this one.
- Pipeline fails, client initiates recovery. We have the old leftover DataXceiver, and a new one doing recovery.
- New DataXceiver calls recoverRbw, grabbing the FsDatasetImpl lock.
- Old DataXceiver is in BlockReceiver#computePartialChunkCrc, calls FsDatasetImpl#getTmpInputStreams and blocks on the FsDatasetImpl lock.
- New DataXceiver, still holding the FsDatasetImpl lock, calls ReplicaInPipeline#stopWriter, which interrupts the old DataXceiver and then joins on it.
- Boom, deadlock. New DX holds the FsDatasetImpl lock and is joining on the old DX, which is in turn waiting on the FsDatasetImpl lock.
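
For anyone who wants to see the pattern in isolation, here is a minimal standalone sketch. It is not HDFS code: the class name, the datasetLock object and the method are all made up for illustration, and the join uses a timeout so the demo terminates instead of hanging. The point it shows is that a thread blocked entering a synchronized block does not respond to Thread#interrupt, so interrupt-then-join cannot succeed while the joiner holds the monitor.

```java
import java.util.concurrent.CountDownLatch;

public class DeadlockSketch {
    // Hypothetical stand-in for the FsDatasetImpl monitor.
    private static final Object datasetLock = new Object();

    /**
     * Returns true if the "old writer" thread terminated within timeoutMs
     * while the caller was still holding datasetLock.
     * timeoutMs must be positive (join(0) would wait forever).
     */
    static boolean joinCompletesWhileHoldingLock(long timeoutMs) throws InterruptedException {
        if (timeoutMs <= 0) {
            throw new IllegalArgumentException("timeoutMs must be > 0");
        }
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch aboutToBlock = new CountDownLatch(1);

        // Plays the old DataXceiver: it needs the lock to make progress.
        Thread oldWriter = new Thread(() -> {
            try {
                lockHeld.await(); // wait until the "new" side owns the lock
            } catch (InterruptedException e) {
                return;
            }
            aboutToBlock.countDown();
            synchronized (datasetLock) {
                // Stand-in for the getTmpInputStreams call; nothing to do here.
            }
        }, "old-DataXceiver");
        oldWriter.setDaemon(true);
        oldWriter.start();

        boolean finished;
        // Plays the new DataXceiver: takes the lock, then interrupt + join.
        synchronized (datasetLock) {
            lockHeld.countDown();
            aboutToBlock.await();
            // Wait until the old writer is actually parked on the monitor.
            while (oldWriter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(1);
            }
            oldWriter.interrupt();      // does not wake a monitor-blocked thread
            oldWriter.join(timeoutMs);  // an untimed join here would hang forever
            finished = !oldWriter.isAlive();
        }
        oldWriter.join(); // lock released, so the old writer can now exit
        return finished;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("join completed while lock held: "
                + joinCompletesWhileHoldingLock(200));
    }
}
```

With the timeout removed from the join inside the synchronized block, this hangs the same way the two DataXceivers do in the jstacks.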