HBase / HBASE-5939

Add an autorestart option in the start scripts

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.95.2
    • Fix Version/s: 0.95.0
    • Component/s: master, regionserver, scripts
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: When launched with autorestart, HBase processes restart automatically if they were not properly terminated, that is, not stopped by a "stop" command or by a cluster stop. To avoid overloading the system when the server itself is corrupted and the process cannot be restarted, the server sleeps for 5 minutes before restarting if it had already been started less than 5 minutes earlier. To use it, launch the process with "bin/start-hbase autorestart". This option is not fully compatible with the existing "restart" command: if you ask for a restart on a server launched with autorestart, the server will restart, but the next server instance won't be restarted automatically.
    • Tags: 0.96notable

    Description

      When a binary dies on a server, we don't try to restart it, even though a restart would be possible in most cases.

      We could have something like:
        loop
          start
          wait
          if cleanStop then exit
          if already stopped less than 5 minutes ago then sleep 5 minutes
        endloop
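
      The loop above could be sketched in shell roughly as follows. This is an illustration only, not the code from the attached patch. The function name run_with_autorestart, its parameters, and the convention that an exit status of 0 means a clean stop are all assumptions made for this example.

```shell
# Hypothetical sketch of the restart loop; not taken from the HBase scripts.
# Usage: run_with_autorestart MIN_UPTIME_SECONDS BACKOFF_SECONDS COMMAND [ARGS...]
# Assumption: COMMAND exits with status 0 only when it was stopped cleanly.
run_with_autorestart() {
  local min_uptime="$1" backoff="$2"
  shift 2
  local started status
  while true; do
    started=$(date +%s)
    status=0
    # "start" and "wait": run the command in the foreground until it exits.
    "$@" || status=$?
    # "if cleanStop then exit"
    if [ "$status" -eq 0 ]; then
      return 0
    fi
    # If the process died soon after being started, pause before the next
    # attempt so that a broken server does not restart in a tight loop.
    if [ $(( $(date +%s) - started )) -lt "$min_uptime" ]; then
      sleep "$backoff"
    fi
  done
}
```

      With both values set to 300, a call such as "run_with_autorestart 300 300 my_server_command" (my_server_command being a placeholder) would approximate the 5-minute rule described above.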

      This is simple for the master and backup master, but a little more complex for the region server, as it can be stopped either by a script or by the shutdown procedure.

      In the long term, this could allow a restart with exactly the same assignments.

      Attachments

        1. 5939.v4.patch
          7 kB
          Nicolas Liochon

            People

              Assignee: Nicolas Liochon (nkeywal)
              Reporter: Nicolas Liochon (nkeywal)