Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-5939

Add an autorestart option in the start scripts

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.95.2
    • Fix Version/s: 0.95.0
    • Component/s: master, regionserver, scripts
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Hide
      When launched with autorestart, HBase processes will automatically restart if they are not properly terminated, either by a "stop" command or by a cluster stop. To ensure that it does not overload the system when the server itself is corrupted and the process cannot be restarted, the server sleeps for 5 minutes before restarting if it was already started 5 minutes ago previously. To use it, launch the process with "bin/start-hbase autorestart". This option is not fully compatible with the existing "restart" command: if you ask for a restart on a server launched with autorestart, the server will restart but the next server instance won't be automatically restarted.
      Show
      When launched with autorestart, HBase processes will automatically restart if they are not properly terminated, either by a "stop" command or by a cluster stop. To ensure that it does not overload the system when the server itself is corrupted and the process cannot be restarted, the server sleeps for 5 minutes before restarting if it was already started 5 minutes ago previously. To use it, launch the process with "bin/start-hbase autorestart". This option is not fully compatible with the existing "restart" command: if you ask for a restart on a server launched with autorestart, the server will restart but the next server instance won't be automatically restarted.
    • Tags:
      0.96notable

      Description

      When a binary dies on a server, we don't try to restart it while it would be possible in most cases.

      We can have something as:
      loop
      start
      wait
      if cleanStop then exit
      if already stopped less than 5 minutes ago sleep 5 minute
      endloop

      This is simple for master & backup master, a little bit more complex for the region server as it can be stopped by a script or by the shutdown procedure.

      On a long long term it could allow a restart with exactly the same assignments.

        Attachments

        1. 5939.v4.patch
          7 kB
          Nicolas Liochon

          Issue Links

            Activity

              People

              • Assignee:
                nkeywal Nicolas Liochon
                Reporter:
                nkeywal Nicolas Liochon
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: