HBase / HBASE-6290

Add a function to mark a server as dead and start the recovery process


Details

    • Improvement
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • 0.95.2
    • 2.0.0
    • monitoring
    • Reviewed
    • Adds a script to mark a server as dead.

      Usage: considerAsDead.sh --hostname serverName
    • beginner

    Description

      ZooKeeper is used as a monitoring tool: we use znodes, and we start the recovery process when ZK deletes a znode because its session timed out. This timeout defaults to 90 seconds and is often set to 30s.

      However, some HW issues could be detected by specialized HW monitoring tools before the ZK timeout expires. For this reason, it makes sense to offer a very simple function to mark an RS as dead. This should not take in

      It could be an HBase shell function such as:
      considerAsDead ipAddress|serverName

      This would delete all the znodes of the server running on this box, starting the recovery process.
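
      As a rough illustration of the idea above, the sketch below picks out the znodes that belong to one host and hands each path to a delete callback. It is not the attached patch. It assumes region server znodes live under /hbase/rs (this depends on zookeeper.znode.parent) and are named hostname,port,startcode. The names znodes_for_host, consider_as_dead and the delete callback are hypothetical, and the actual ZooKeeper call is left to the caller.

```python
from typing import Callable, Iterable, List

# Assumed default location of region server znodes; the real parent
# depends on the cluster's zookeeper.znode.parent setting.
RS_PARENT = "/hbase/rs"


def znodes_for_host(children: Iterable[str], hostname: str,
                    parent: str = RS_PARENT) -> List[str]:
    """Return the full znode paths of every server running on `hostname`.

    `children` are znode names assumed to look like 'host,port,startcode'.
    """
    wanted = hostname.lower()
    paths = []
    for child in children:
        host = child.split(",", 1)[0].lower()
        if host == wanted:
            paths.append(f"{parent}/{child}")
    return paths


def consider_as_dead(children: Iterable[str], hostname: str,
                     delete: Callable[[str], None]) -> List[str]:
    """Call `delete` on each znode of `hostname`; return the deleted paths.

    `delete` is a hypothetical callback wrapping a ZooKeeper client.
    """
    paths = znodes_for_host(children, hostname)
    for path in paths:
        delete(path)
    return paths


if __name__ == "__main__":
    # Illustrative names only; ports and start codes are made up.
    listing = [
        "rs1.example.com,16020,1700000000000",
        "rs2.example.com,16020,1700000000001",
    ]
    consider_as_dead(listing, "rs1.example.com", print)
```

      Passing the delete step in as a callback keeps the matching logic testable without a running ZooKeeper ensemble; a monitoring tool would supply its own client call there, at the caller's risk.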

      Such a function would be easily callable (at the caller's risk) by any fault detection tool. However, we could have trouble identifying the right master and region servers because of IPv4 vs. IPv6 addressing and multi-networked boxes.
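
      One part of the addressing concern can be sketched with the standard library alone: an IPv4 address and its IPv4-mapped IPv6 form should be treated as the same host, and a multi-networked box has several addresses to check. The helpers normalize and matches_any are hypothetical and only handle IP literals; resolving host names to addresses is left out.

```python
import ipaddress
from typing import Iterable


def normalize(addr: str):
    """Parse an IP literal, folding IPv4-mapped IPv6 down to plain IPv4."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return ip.ipv4_mapped
    return ip


def matches_any(target: str, host_addresses: Iterable[str]) -> bool:
    """True if `target` equals any address of a (possibly multi-homed) host."""
    wanted = normalize(target)
    return any(normalize(a) == wanted for a in host_addresses)


if __name__ == "__main__":
    # Illustrative addresses for a box with two interfaces.
    nics = ["10.0.0.5", "192.168.1.5"]
    print(matches_any("::ffff:10.0.0.5", nics))
```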

      Attachments

        1. 6290.doc.addendum.txt
          1 kB
          Michael Stack
        2. 6290v2.txt
          2 kB
          Michael Stack
        3. HBASE-6290.patch
          2 kB
          Talat Uyarer


          People

            talat Talat Uyarer
            nkeywal Nicolas Liochon
            Votes: 0
            Watchers: 8
