Details
- Sub-task
- Status: Resolved
- Major
- Resolution: Fixed
- None
- None
- Reviewed
Description
When starting an HA blockpool with both namenodes up, I often see an NPE, referenced by one of the TODOs. The issue is the following interleaving:
- The first BPActor registers and sets bpNSInfo in BPOfferService. It then proceeds to initFsDataset, which takes some time.
- The second BPActor registers, sees that bpNSInfo is non-null, and proceeds to the heartbeat loop. Meanwhile, the first BPActor is still initializing the FSDataset.
- The second BPActor gets an NPE on its first heartbeat, since fsdataset is still null.
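We just need to synchronize that function to fix the NPE, so that the second BPActor cannot see bpNSInfo as non-null until the first BPActor has finished initializing the FSDataset.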
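To illustrate the interleaving and the proposed fix, here is a minimal standalone sketch. It is not the actual BPOfferService code: the class BlockPoolSketch, the method setNamespaceInfoAndInit, and the 50 ms sleep that stands in for a slow initFsDataset are all hypothetical names and values chosen for the example. The point is only that making the check, the assignment, and the dataset initialization one synchronized step forces the second actor to wait until the dataset exists.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for BPOfferService; all names here are assumptions.
public class BlockPoolSketch {
    private String nsInfo;   // plays the role of bpNSInfo
    private Object dataset;  // plays the role of the FSDataset

    // synchronized: the null check, the assignment, and the slow
    // initialization happen as one step, so a second caller blocks
    // here until the dataset has been created.
    synchronized void setNamespaceInfoAndInit(String ns) {
        if (nsInfo == null) {
            nsInfo = ns;
            sleepQuietly(50);        // simulate a slow initFsDataset
            dataset = new Object();
        }
    }

    // Throws NullPointerException if the dataset is still null.
    int heartbeat() {
        return dataset.hashCode();
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs two "actors" concurrently and counts heartbeats that hit an NPE.
    static int runTwoActors() {
        BlockPoolSketch bpos = new BlockPoolSketch();
        AtomicInteger failures = new AtomicInteger();
        Runnable actor = () -> {
            try {
                bpos.setNamespaceInfoAndInit("ns-1");
                bpos.heartbeat();
            } catch (NullPointerException e) {
                failures.incrementAndGet();
            }
        };
        Thread first = new Thread(actor);
        Thread second = new Thread(actor);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failures.get();
    }

    public static void main(String[] args) {
        System.out.println("heartbeat failures: " + runTwoActors());
    }
}
```

With the synchronized keyword in place, neither actor should reach heartbeat() before the dataset is set. Removing the keyword from setNamespaceInfoAndInit reintroduces the window described above, where the second actor sees a non-null nsInfo and skips ahead while the first is still initializing.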