HBase / HBASE-7242

Use Runtime.exit() instead of Runtime.halt() upon HLog Sync failures


Details

    • Type: Brainstorming
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem

    Description

      Hey Guys,
      Should we use Runtime.exit() instead of Runtime.halt() when an HLog sync fails?

      The key difference is that Runtime.exit() runs the registered shutdown hooks, while Runtime.halt() does not.
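
      To make the difference concrete, here is a minimal standalone sketch. It is not HBase code; the class name ExitVsHaltDemo and the helper installHook are made up for illustration. It registers one shutdown hook and then terminates the JVM either way, depending on the argument.

```java
class ExitVsHaltDemo {
    // Hypothetical helper: registers a JVM shutdown hook and returns the hook thread.
    static Thread installHook(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "demo-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        String mode = args.length == 1 ? args[0] : "";
        if (!mode.equals("exit") && !mode.equals("halt")) {
            System.out.println("usage: java ExitVsHaltDemo exit|halt");
            return;
        }
        installHook(() -> System.out.println("shutdown hook ran"));
        if (mode.equals("exit")) {
            // Starts the shutdown sequence: registered hooks run, then the JVM stops.
            Runtime.getRuntime().exit(1);
        } else {
            // Stops the JVM immediately: registered hooks are never started.
            Runtime.getRuntime().halt(1);
        }
    }
}
```

      Running it with "exit" should print the hook message before the process ends with status 1; running it with "halt" should end with the same status and no message.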

      Why we might need this:
      We had an HDFS NameNode reboot today on one of our cells, and this caused multiple region servers to abort because they could not sync the HLog.

      However, since multiple RSs died simultaneously, this looked like a correlated failure to the master. The master waits for the znode to expire,
      but this can take up to a few minutes after RS death (this setting is in place so that we can withstand rack switch reboots, lasting a couple of minutes, without region movement).

      If the shutdown hooks are called, the RS will close its ZK connection, causing an immediate znode expiry. This might help cut down the unavailability,
      as regions can begin to get assigned sooner.

      While we do want to abort on HLog failure, I do not think it would hurt to give the JVM a few seconds to shut down gracefully. Please let me know
      if I am missing something.
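
      One way to get "a few seconds, but no more" is sketched below. This is only an illustration under assumptions, not the actual HBase abort path: the class BoundedAbort, its abort methods, and the grace period parameter are hypothetical. A daemon watchdog thread is started first and calls halt once the grace period is over; exit is then called so the shutdown hooks can run. If the hooks finish in time the JVM is already gone, and if they hang, the watchdog forces termination.

```java
import java.util.function.IntConsumer;

class BoundedAbort {
    // Hypothetical helper. The exit and halt actions are parameters so the logic
    // can be exercised without actually terminating the JVM.
    static Thread abort(int status, long graceMillis, IntConsumer exit, IntConsumer halt) {
        Thread watchdog = new Thread(() -> {
            try {
                Thread.sleep(graceMillis);
            } catch (InterruptedException e) {
                // Interrupted while waiting: fall through and halt right away.
            }
            halt.accept(status);
        }, "abort-watchdog");
        // Daemon threads keep running during the shutdown sequence and do not
        // keep the JVM alive on their own.
        watchdog.setDaemon(true);
        watchdog.start();
        // With Runtime::exit this call runs the shutdown hooks and never returns.
        exit.accept(status);
        return watchdog;
    }

    // Convenience overload wired to the real JVM calls.
    static void abort(int status, long graceMillis) {
        Runtime rt = Runtime.getRuntime();
        abort(status, graceMillis, rt::exit, rt::halt);
    }
}
```

      For example, BoundedAbort.abort(1, 5000) would give the hooks roughly five seconds before the process is halted. The right grace period is a judgment call and would have to be weighed against the ZK session timeout mentioned above.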

      Thanks,
      -Amit

          People

            Assignee: Unassigned
            Reporter: Amitanand Aiyer (amitanand)
            Votes: 0
            Watchers: 4
