SLIDER-1161: Improve regionserver status check in HBase Slider app package

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Slider 0.80
    • Fix Version/s: None
    • Component/s: app-package
    • Labels: None
    • Environment: RHEL-6 (64 Bit)

    • Important

    Description

      PROBLEM:

      We use Slider to launch HBase containers.
      The problem is as follows:
      1. Assume a region server goes into a long pause and loses its heartbeat with ZooKeeper.
      2. HMaster notices this and marks the region server as DEAD.
      3. However, the Slider agent continues to 'ps' the region server process every heartbeat.monitor.interval (45000 ms in my case). Because it only checks that the region server process is alive, it does not consider the region server dead.
      4. After the long pause, the region server finally recovers and reports to HMaster.
      5. HMaster responds to the region server with YouAreAlreadyDeadException.
      6. The region server then shuts itself down, and Slider notices that the process is no longer running.
      7. Slider launches a new region server.

      As the steps above show, there can be a long delay between steps 4 and 6. During that time the cluster runs with fewer region servers, which puts more load on the remaining ones.
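
      To make the delay concrete, here is a rough arithmetic sketch in Python. It assumes the monitor checks at fixed multiples of the interval; the timestamps are invented for illustration and only the 45000 ms interval comes from this report. It compares when a process-only check would notice the failure with when a check that also consulted HMaster could notice it.

```python
import math


def next_check_at(event_ms, interval_ms):
    """Time of the first periodic check at or after event_ms.

    Assumes checks run at 0, interval_ms, 2 * interval_ms, ...
    """
    return math.ceil(event_ms / interval_ms) * interval_ms


def detection_times(marked_dead_ms, process_exit_ms, interval_ms):
    """Return (process_only_ms, master_synced_ms) detection times."""
    # A process-only check sees nothing wrong until the process has exited.
    process_only = next_check_at(process_exit_ms, interval_ms)
    # A master-synced check could react once the master marks the server dead.
    master_synced = next_check_at(marked_dead_ms, interval_ms)
    return process_only, master_synced


# Hypothetical timeline: marked DEAD at 60 s, process exits at 600 s.
process_only, master_synced = detection_times(
    marked_dead_ms=60000, process_exit_ms=600000, interval_ms=45000
)
print(process_only, master_synced, process_only - master_synced)
```

      With these made-up numbers the process-only check reacts 540000 ms later than a master-synced check would.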

      The issue could be solved if Slider synced with HMaster to determine whether a region server is alive. Slider would then know immediately that HMaster has marked a region server as dead, and could bring that region server down and launch a new one.
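
      A minimal sketch of the proposed decision logic, in Python. This is not Slider's actual agent code: the function name, the action strings, and the way the dead-server list is obtained are all assumptions made for illustration. The server names follow HBase's host,port,startcode form, with invented values.

```python
def regionserver_action(process_alive, server_name, master_dead_servers):
    """Decide what the monitor should do with one region server.

    process_alive: result of the existing process check (e.g. 'ps').
    server_name: identity this region server registered with the master.
    master_dead_servers: names the master currently lists as dead
        (how this list is fetched is outside this sketch).
    """
    if not process_alive:
        # Existing behaviour: the process is gone, so start a replacement.
        return "START_NEW"
    if server_name in master_dead_servers:
        # Proposed behaviour: the process is running but the master has
        # already given up on it, so stop it and start a replacement.
        return "STOP_AND_START_NEW"
    return "NONE"


# Hypothetical server names and dead list.
dead = {"rs2.example.com,16020,1431000000000"}
print(regionserver_action(True, "rs1.example.com,16020,1431000000000", dead))
print(regionserver_action(True, "rs2.example.com,16020,1431000000000", dead))
print(regionserver_action(False, "rs3.example.com,16020,1431000000000", dead))
```

      Matching on the full name, including the start code, would keep a freshly started region server on the same host and port from being mistaken for the dead one.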

          People

            Assignee: Unassigned
            Reporter: Sandeep Nemuri
            Votes: 1
            Watchers: 3
