- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: HA branch (HDFS-1623)
- Labels: None
- Target Version/s:
- Hadoop Flags: Reviewed
When starting an HA blockpool with both namenodes up, I often see an NPE, referenced by one of the TODOs. The issue is the following interleaving:
- The first BPActor registers and sets bpNSInfo in BPOfferService. It then proceeds to initFsDataset, which takes a little while.
- The second BPActor registers, sees that bpNSInfo is non-null, and proceeds to the heartbeat loop. Meanwhile, the first BPActor is still initializing the FSDataset.
- The second BPActor hits an NPE on its first heartbeat because the FSDataset is still null.
We just need to synchronize that function to fix the NPE.
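
To illustrate the idea, here is a minimal, self-contained sketch of the fix. It is not the actual Hadoop code: `OfferService`, `registerAndInit`, and `datasetReady` are hypothetical stand-ins for BPOfferService and its registration path, and the sketch assumes that setting bpNSInfo and initializing the dataset happen inside the one function being synchronized. With `synchronized`, the second actor blocks until the first has finished initializing, so it never observes a non-null namespace info alongside a null dataset.

```java
public class BlockPoolInitSketch {

    /** Hypothetical stand-in for BPOfferService; not the real Hadoop class. */
    static class OfferService {
        private Object nsInfo;   // plays the role of bpNSInfo
        private Object dataset;  // plays the role of the FSDataset

        // synchronized: a second caller waits here until the first caller
        // has set nsInfo AND finished the slow dataset initialization.
        synchronized void registerAndInit(Object info) throws InterruptedException {
            if (nsInfo == null) {
                nsInfo = info;
                Thread.sleep(50);        // simulate the slow initFsDataset step
                dataset = new Object();
            }
            // Later callers see nsInfo != null and skip initialization,
            // but only after the dataset is already in place.
        }

        synchronized boolean datasetReady() {
            return dataset != null;
        }
    }

    /** Runs two actor threads; returns true if neither saw a null dataset. */
    static boolean runTwoActors() throws InterruptedException {
        final OfferService service = new OfferService();
        final boolean[] sawDataset = new boolean[2];
        Thread[] actors = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            actors[i] = new Thread(() -> {
                try {
                    service.registerAndInit("ns-info");
                    // Stands in for the first heartbeat, which needs the dataset.
                    sawDataset[id] = service.datasetReady();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            actors[i].start();
        }
        for (Thread t : actors) {
            t.join();  // join makes the writes to sawDataset visible here
        }
        return sawDataset[0] && sawDataset[1];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("both actors saw an initialized dataset: " + runTwoActors());
    }
}
```

Without the `synchronized` keyword on `registerAndInit`, the second thread could return as soon as it sees a non-null `nsInfo`, while the first thread is still inside the simulated initialization. That is the interleaving described above.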