  1. HBase
  2. HBASE-5639

The logic used in waiting for region servers during startup is broken

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.94.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      See the tail of HBASE-4993, which I'll repeat here:

      Me:

      I think a bug was introduced here. Here's the new waiting logic in waitForRegionServers; the master stops waiting once:

      'hbase.master.wait.on.regionservers.mintostart' has been reached AND
      no new region server has checked in for
      'hbase.master.wait.on.regionservers.interval' time

      And the loop condition in the code that is supposed to verify that:

      !(lastCountChange+interval > now && count >= minToStart)

      Nic:

      It seems that changing the code to

      (count < minToStart ||
      lastCountChange+interval > now)

      would make the code work as documented.
      If you have 0 region servers that checked in and you are under the interval, you wait: (true or true) = true.
      If you have 0 region servers but you are above the interval, you wait: (true or false) = true.
      If you have 1 or more region servers that checked in and you are under the interval, you wait: (false or true) = true.
      If you have 1 or more region servers that checked in and you are above the interval, you stop waiting: (false or false) = false.
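
      The case analysis above can be checked mechanically. The following is only a sketch, not the code from the patch: the class and method names are hypothetical, and it assumes minToStart is 1, as the "0 region servers" cases imply. It evaluates the original condition and the proposed one side by side as "keep waiting" predicates.

```java
// Hypothetical stand-alone sketch; not taken from HBASE-5639.patch.
public class WaitConditionSketch {

    // Original condition quoted in the description.
    static boolean keepWaitingOriginal(long now, long lastCountChange,
                                       long interval, int count, int minToStart) {
        return !(lastCountChange + interval > now && count >= minToStart);
    }

    // Condition proposed in the description.
    static boolean keepWaitingProposed(long now, long lastCountChange,
                                       long interval, int count, int minToStart) {
        return count < minToStart || lastCountChange + interval > now;
    }

    public static void main(String[] args) {
        long lastCountChange = 0L;   // time of the last region server check-in
        long interval = 1000L;       // assumed value, for illustration only
        int minToStart = 1;          // assumed minimum, matching the cases above

        long[] times = {500L, 2000L}; // under the interval, above the interval
        int[] counts = {0, 1};        // below the minimum, at the minimum

        for (int count : counts) {
            for (long now : times) {
                System.out.printf("count=%d now=%d original=%b proposed=%b%n",
                        count, now,
                        keepWaitingOriginal(now, lastCountChange, interval, count, minToStart),
                        keepWaitingProposed(now, lastCountChange, interval, count, minToStart));
            }
        }
    }
}
```

      With these assumed values the two predicates agree while no region server has checked in, and disagree once one has: the original condition stops waiting while still inside the interval, and keeps waiting after the interval has elapsed.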

      Attachments

        1. HBASE-5639.patch
          0.6 kB
          Jean-Daniel Cryans

          People

            jdcryans Jean-Daniel Cryans
            jdcryans Jean-Daniel Cryans