Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Invalid
Description
2018-02-20 16:36:41,794 ERROR [Thread-85] assignment.AssignmentManager: java.lang.NullPointerException
java.lang.NullPointerException
    at org.apache.hadoop.hbase.util.VersionInfo.compareVersion(VersionInfo.java:122)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.lambda$getExcludedServersForSystemTable$5(AssignmentManager.java:1860)
    at java.util.Collections.max(Collections.java:712)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.getExcludedServersForSystemTable(AssignmentManager.java:1859)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.lambda$checkIfShouldMoveSystemRegionAsync$0(AssignmentManager.java:464)
This looks like a race condition around an RS losing its ZK lock. If the AM checks whether it should move a Region to a server whose lock we have already seen lost, but which has not yet been processed as "dead", HMaster.getRegionServerVersion() can return null for that server. The null version is then passed to VersionInfo.compareVersion() from inside Collections.max(), which throws the NullPointerException shown in the stack trace.
A simple filter on the servers that excludes any with a null version looks like it would fix the problem.
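A minimal sketch of that kind of filter, outside of HBase. The class name, the `excludedServers` and `compareVersion` helpers, and the map of server name to version string are all hypothetical stand-ins for illustration, not the actual AssignmentManager code. The exclusion rule (servers older than the highest known version) is likewise an assumption.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration only; not the HBase implementation.
final class NullVersionFilterSketch {

    // Stand-in version comparison. Assumes purely numeric, dot-separated
    // versions such as "2.0.0"; a missing part is treated as 0.
    static int compareVersion(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Returns the servers whose version is known and lower than the highest
    // known version. Servers with a null version are dropped before
    // Collections.max() runs, so the comparator never sees a null.
    static List<String> excludedServers(Map<String, String> versionByServer) {
        List<Map.Entry<String, String>> known = versionByServer.entrySet().stream()
            .filter(e -> e.getValue() != null)
            .collect(Collectors.toList());
        if (known.isEmpty()) {
            // Collections.max() throws on an empty collection, so bail out early.
            return Collections.emptyList();
        }
        String highest = Collections.max(
            known, (l, r) -> compareVersion(l.getValue(), r.getValue())).getValue();
        return known.stream()
            .filter(e -> compareVersion(e.getValue(), highest) < 0)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> versions = new LinkedHashMap<>();
        versions.put("rs1", "2.0.0");
        versions.put("rs2", null);    // lock lost, not yet processed as dead
        versions.put("rs3", "1.4.2");
        System.out.println(excludedServers(versions)); // prints [rs3]
    }
}
```

The filter has to run before the maximum is computed, not only before the final comparison, because in the stack trace above the failure happens inside Collections.max() itself. The empty-list check matters for the same reason: if every server has a null version, nothing is left to take a maximum of.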