HBASE-23035

Retaining regions on the last RegionServer makes failover slower


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha-1, 2.3.0, 2.2.1, 2.1.6
    • Fix Version/s: 3.0.0-alpha-1, 2.3.0, 2.1.7, 2.2.2
    • Component/s: amv2
    • Labels: None
    • Release Note:
      Since 2.0.0, when a RegionServer crashes and comes back online, the AssignmentManager retains the region locations and tries to assign the regions to that RegionServer (same host:port as the crashed one) again. In 1.x.x, the behavior was round-robin assignment for the regions belonging to the crashed RegionServer. This JIRA changes the "retain" assignment back to round-robin assignment, the same as the 1.x.x behavior. This change makes failover faster and improves availability.
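      A minimal sketch of the difference between the two strategies; the class and the plain region/server strings below are hypothetical and not the actual AssignmentManager/LoadBalancer API:

      {code:java}
      import java.util.HashMap;
      import java.util.List;
      import java.util.Map;

      public class AssignmentSketch {

        // "Retain" assignment: every region of the crashed server goes back to the
        // same host:port, so the single restarted RegionServer opens all of them itself.
        static Map<String, String> retainAssign(Map<String, String> lastLocations) {
          return new HashMap<>(lastLocations);
        }

        // Round-robin assignment: regions are spread across all live RegionServers,
        // so many servers open regions in parallel and failover finishes sooner.
        static Map<String, String> roundRobinAssign(List<String> regions, List<String> liveServers) {
          Map<String, String> plan = new HashMap<>();
          for (int i = 0; i < regions.size(); i++) {
            plan.put(regions.get(i), liveServers.get(i % liveServers.size()));
          }
          return plan;
        }

        public static void main(String[] args) {
          List<String> regions = List.of("r1", "r2", "r3", "r4", "r5", "r6");
          List<String> live = List.of("rs-a:16020", "rs-b:16020", "rs-c:16020");

          Map<String, String> last = new HashMap<>();
          regions.forEach(r -> last.put(r, "rs-a:16020"));
          System.out.println(retainAssign(last));          // everything piles back onto rs-a
          System.out.println(roundRobinAssign(regions, live)); // spread across rs-a, rs-b, rs-c
        }
      }
      {code}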

    Description

      Now, if one RS crashes, the regions try to use their old location for redeployment. But one RS only has 3 threads to open regions by default, so if the RS hosted hundreds of regions, failover is very slow. Assigning to the same RS may give good locality if the DataNode is deployed on the same host, but the slower failover makes availability worse. And locality is not a big deal when HBase is deployed on the cloud.

      This was introduced by HBASE-18946.
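      The "3 threads by default" above refers to the RegionServer's region-open executor pool. A hedged sketch of reading that setting from the client-side configuration; the key hbase.regionserver.executor.openregion.threads and its default of 3 should be verified against the HBase version in use:

      {code:java}
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.hbase.HBaseConfiguration;

      public class OpenRegionThreads {
        public static void main(String[] args) {
          Configuration conf = HBaseConfiguration.create();
          // Number of threads a RegionServer uses to open regions (defaults to 3).
          int threads = conf.getInt("hbase.regionserver.executor.openregion.threads", 3);
          System.out.println("open-region executor threads: " + threads);
          // Raising this in hbase-site.xml is one mitigation for slow failover when a
          // single server must open many regions; round-robin assignment is the other.
        }
      }
      {code}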


      People

        Assignee: Guanghao Zhang (zghao)
        Reporter: Guanghao Zhang (zghao)
        Votes: 0
        Watchers: 9
