HBase / HBASE-20028

NPE when comparing versions in AM after RS ZK expiration

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • None
    • None
    • master
    • None

    Description

      2018-02-20 16:36:41,794 ERROR [Thread-85] assignment.AssignmentManager: java.lang.NullPointerException
      java.lang.NullPointerException
      	at org.apache.hadoop.hbase.util.VersionInfo.compareVersion(VersionInfo.java:122)
      	at org.apache.hadoop.hbase.master.assignment.AssignmentManager.lambda$getExcludedServersForSystemTable$5(AssignmentManager.java:1860)
      	at java.util.Collections.max(Collections.java:712)
      	at org.apache.hadoop.hbase.master.assignment.AssignmentManager.getExcludedServersForSystemTable(AssignmentManager.java:1859)
      	at org.apache.hadoop.hbase.master.assignment.AssignmentManager.lambda$checkIfShouldMoveSystemRegionAsync$0(AssignmentManager.java:464)

      This looks like a race condition around a RegionServer (RS) losing its ZooKeeper (ZK) lock. If the AssignmentManager (AM) checks whether it should move a region to a server whose lock loss we have already seen, but which has not yet been processed as "dead", HMaster.getRegionServerVersion() can return null for that server, and the version comparison then fails with the NullPointerException above.

      Looks like a simple filter on the servers to preclude null versions would fix the problem.
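
      A minimal sketch of the kind of filter described above, written against plain Java collections rather than the actual HBase classes. The class name, the map of server names to version strings, and the simplified compareVersion helper are all hypothetical stand-ins; the real AssignmentManager and VersionInfo code may look quite different.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

public class NullVersionFilterSketch {

    // Hypothetical stand-in for a version comparison. Like the call in the
    // stack trace, it throws NullPointerException if either argument is null.
    // Assumption: versions are purely numeric and dot-separated (no suffixes).
    static int compareVersion(String v1, String v2) {
        String[] a = v1.split("\\.");
        String[] b = v2.split("\\.");
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? Integer.parseInt(a[i]) : 0;
            int y = i < b.length ? Integer.parseInt(b[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // Finds the highest known version, skipping servers whose version is
    // null (for example, a server that has expired but is not yet marked dead).
    static Optional<String> highestVersion(Map<String, String> versionByServer) {
        return versionByServer.values().stream()
            .filter(Objects::nonNull)
            .max(NullVersionFilterSketch::compareVersion);
    }

    public static void main(String[] args) {
        Map<String, String> versionByServer = new HashMap<>();
        versionByServer.put("rs1", "2.0.0");
        versionByServer.put("rs2", null); // version lookup returned null
        versionByServer.put("rs3", "2.1.0");
        System.out.println(highestVersion(versionByServer));
    }
}
```

      Returning an Optional also covers the case where every server has a null version, so the caller has to decide explicitly what to do when no version is known.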

          People

            elserj Josh Elser