HBase / HBASE-1583

Start/Stop of large cluster untenable

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: No compaction on disable or shutdown of cluster. No compaction on open or enable unless the region has references.

    Description

      Starting and stopping a large, loaded cluster is far too flaky and takes too long. This is on 0.19.x, but I'd say the same issues apply to TRUNK.

      At pset with our > 100 nodes carrying 6k regions:

      + Shutdown takes way too long, maybe ten minutes or so. We compact regions inline with shutdown; we should just go down. It also doesn't seem like all regionservers go down every time.
      + Startup is a mess, with regions being assigned out and rebalanced at the same time. By the time the compactions on open have run, it can be near an hour before the whole thing settles down and becomes usable.
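
      The behavior asked for here (and summarized in the release note) can be sketched roughly as below. This is only an illustration of the decision rule, not the attached patch: the class and method names are hypothetical and are not HBase APIs, and "has references" is assumed to mean the region still holds reference files left over from a split.

```java
// Hypothetical sketch of the compaction policy described in this issue.
// None of these names come from HBase; they only illustrate the rule.
class RegionLifecyclePolicySketch {

    /** Close (table disable or cluster shutdown): never compact, just go down. */
    static boolean shouldCompactOnClose() {
        return false;
    }

    /**
     * Open (cluster startup or table enable): compact only when the region
     * still has reference files (assumed here to come from a recent split).
     */
    static boolean shouldCompactOnOpen(boolean hasReferences) {
        return hasReferences;
    }

    public static void main(String[] args) {
        System.out.println("compact on close: " + shouldCompactOnClose());
        System.out.println("compact on open, no references: " + shouldCompactOnOpen(false));
        System.out.println("compact on open, has references: " + shouldCompactOnOpen(true));
    }
}
```

      With a rule like this, a plain restart of a settled cluster would trigger no compactions at all; only regions that still carry references would compact when opened.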

      Attachments

        1. 1583-v2-nocompactonopenclose.patch
          4 kB
          Michael Stack
        2. 1583-nocompactonclose.patch
          2 kB
          Michael Stack

          People

            Assignee: Michael Stack (stack)
            Reporter: Michael Stack (stack)
            Votes: 1
            Watchers: 2
