HBASE-27551: Add config options to delay assignment to retain last region location

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0, 3.0.0-alpha-4
    • Component/s: None
    • Labels: None
    • Release Note:
      This change introduces a boolean hbase.master.scp.retain.assignment.force property, with a default value of false, to the AssignmentManager.
      AssignmentManager already defines a hbase.master.scp.retain.assignment property, which makes it prioritise the RegionServer on which the region was previously online when coming up with an assignment plan. This, however, does not guarantee that the assignment is retained if the ServerCrashProcedure (SCP) triggers the TransitRegionStateProcedure (TRSP) before the given RegionServer is online.
      To forcibly honour the retention, hbase.master.scp.retain.assignment.force should also be set to true.
      Note that this can delay the region assignment until the given RegionServer reports itself as online to the master, and in the meantime regions in transition (RITs) may be reported on the master UI or by HBCK.
      The number of times the TRSP will try to open the region on the given RS is determined by hbase.master.scp.retain.assignment.force.retries (default 600). Between retries, the TRSP sleeps for an exponential factor of the value defined in hbase.master.scp.retain.assignment.force.wait-interval (default 50), in milliseconds.
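      As an illustration of the properties named in the release note, a master-side hbase-site.xml fragment might look like the sketch below. The property names and the values for retries and wait-interval are the ones quoted above; placing them in hbase-site.xml on the master, and setting both retain properties to true together, is an assumption based on the wording of the note rather than a verified deployment recipe.

```xml
<configuration>
  <!-- Existing property: prefer the previous RegionServer when planning assignment. -->
  <property>
    <name>hbase.master.scp.retain.assignment</name>
    <value>true</value>
  </property>
  <!-- New property (default false): wait for the previous RegionServer
       instead of assigning elsewhere. -->
  <property>
    <name>hbase.master.scp.retain.assignment.force</name>
    <value>true</value>
  </property>
  <!-- How many times the TRSP retries opening the region on that server
       (600 is the stated default). -->
  <property>
    <name>hbase.master.scp.retain.assignment.force.retries</name>
    <value>600</value>
  </property>
  <!-- Base wait between retries in milliseconds (50 is the stated default);
       the actual sleep grows exponentially from this value. -->
  <property>
    <name>hbase.master.scp.retain.assignment.force.wait-interval</name>
    <value>50</value>
  </property>
</configuration>
```

      Lowering the retries value shortens how long a region can stay unassigned while waiting for its previous RegionServer, at the cost of falling back to another server (and losing the persisted cache) sooner.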

    Description

      HBASE-27313 introduced the ability to persist the list of files cached in a given RS, but a temporary RS loss or restart would cause regions to be eagerly reassigned to other RSes, making the persisted cache useless. For some use cases, such as when using object store based persistence, the performance degradation caused by cache misses has a worse impact than temporary region unavailability.

      This proposes an additional config property (disabled by default) that forces the TRSP to wait for a configurable time, checking whether the RS that previously held the region has come back online, before proceeding with the region assignment.

            People

              Assignee: Wellington Chevreuil (wchevreuil)
              Reporter: Wellington Chevreuil (wchevreuil)
              Votes: 0
              Watchers: 2