HBase / HBASE-26811

Secondary replica may be disabled for read incorrectly forever



    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha-2, 2.4.10
    • Fix Version/s: 2.5.0, 3.0.0-alpha-3, 2.4.12
    • Component/s: read replicas
    • Labels: None


      For read replicas, when I set hbase.region.replica.wait.for.primary.flush to false and explicitly set TableDescriptorBuilder.setRegionMemStoreReplication to true at table level, the secondary replica is disabled for reads. Reading from this replica region throws:

      java.io.IOException:  The region's reads are disabled. Cannot serve the request
      	at org.apache.hadoop.hbase.regionserver.HRegion.checkReadsEnabled(HRegion.java:5187)
      	at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:8279)

      Strangely, if I don't explicitly set TableDescriptorBuilder.setRegionMemStoreReplication to true (its default value is already true), the secondary replica works normally.
      The cause is that when hbase.region.replica.wait.for.primary.flush is set to false, HRegionServer.startServices does not create the ExecutorType.RS_REGION_REPLICA_FLUSH_OPS executor for RegionReplicaFlushHandler at the HRegionServer level:

           if (ServerRegionReplicaUtil.isRegionReplicaWaitForPrimaryFlushEnabled(conf)) {
            final int regionReplicaFlushThreads = conf.getInt(
                "hbase.regionserver.region.replica.flusher.threads", conf.getInt(
                    "hbase.regionserver.executor.openregion.threads", 3));
            executorService.startExecutorService(executorService.new ExecutorConfig().setExecutorType(

      But when I explicitly set TableDescriptorBuilder.setRegionMemStoreReplication to true, it also sets hbase.region.replica.wait.for.primary.flush to true at table level (there is no public hbase.region.replica.wait.for.primary.flush config for HBase users at table level):

      public ModifyableTableDescriptor setRegionMemStoreReplication(boolean memstoreReplication) {
            setValue(REGION_MEMSTORE_REPLICATION_KEY, Boolean.toString(memstoreReplication));
            // If the memstore replication is setup, we do not have to wait for observing a flush event
            // from primary before starting to serve reads, because gaps from replication is not applicable

      So when the secondary replica region is opened, HRegionServer.triggerFlushInPrimaryRegion is invoked for it. Because hbase.region.replica.wait.for.primary.flush is true at table level, line 2234 is skipped and the secondary replica is disabled for reads at line 2238. But there is no ExecutorType.RS_REGION_REPLICA_FLUSH_OPS executor for RegionReplicaFlushHandler at the HRegionServer level, so line 2243 does not schedule a RegionReplicaFlushHandler, and the secondary replica stays disabled for reads.

      2227  private void triggerFlushInPrimaryRegion(final HRegion region) {
      2232      if (!ServerRegionReplicaUtil.isRegionReplicaReplicationEnabled(region.conf, tn) ||
      2233           !ServerRegionReplicaUtil.isRegionReplicaWaitForPrimaryFlushEnabled(region.conf)) {
      2234            region.setReadsEnabled(true);
      2235            return;
      2236       }
      2238      region.setReadsEnabled(false); // disable reads before marking the region as opened.
      2239      // RegionReplicaFlushHandler might reset this.
      2241      // Submit it to be handled by one of the handlers so that we do not block OpenRegionHandler
      2242     if (this.executorService != null) {
      2243        this.executorService.submit(new RegionReplicaFlushHandler(this, region));
      2244     } else {
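
      The interaction described above can be reproduced in a small stand-alone model. This is only a sketch, not HBase code: the class ReplicaOpenModel and its methods are hypothetical, the configuration is reduced to plain maps, the isRegionReplicaReplicationEnabled check is left out, and the flush handler is assumed to re-enable reads whenever it gets scheduled.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of the open path; not the HBase implementation. */
class ReplicaOpenModel {
  static final String WAIT_KEY = "hbase.region.replica.wait.for.primary.flush";

  /** Reads the wait-for-primary-flush flag; the default of true is assumed here. */
  static boolean waitFlag(Map<String, String> conf) {
    return Boolean.parseBoolean(conf.getOrDefault(WAIT_KEY, "true"));
  }

  /** Returns whether reads end up enabled after a secondary replica is opened. */
  static boolean readsEnabledAfterOpen(Map<String, String> serverConf,
      Map<String, String> tableValues) {
    // Server start-up: the flush executor exists only if the server-level flag is true.
    boolean flushPoolExists = waitFlag(serverConf);

    // Region-level view: table-level values shadow the server-level ones.
    Map<String, String> regionConf = new HashMap<>(serverConf);
    regionConf.putAll(tableValues);

    if (!waitFlag(regionConf)) {
      return true; // corresponds to line 2234: reads enabled right away
    }
    boolean readsEnabled = false; // corresponds to line 2238
    if (flushPoolExists) {
      // Assumption: a scheduled handler eventually re-enables reads.
      readsEnabled = true;
    }
    return readsEnabled; // without a pool nothing is scheduled, reads stay off
  }

  public static void main(String[] args) {
    Map<String, String> server = Collections.singletonMap(WAIT_KEY, "false");
    // Table descriptor left untouched: prints true
    System.out.println(readsEnabledAfterOpen(server, Collections.emptyMap()));
    // Table-level flag forced to true by the setter: prints false
    System.out.println(readsEnabledAfterOpen(server, Collections.singletonMap(WAIT_KEY, "true")));
  }
}
```

      The second call is the reported case: the region-level view says "wait for the primary flush", while the server never created the executor that would end the wait.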

      I think that when ModifyableTableDescriptor.setRegionMemStoreReplication above is set to true, there is no reason to also set hbase.region.replica.wait.for.primary.flush to true at table level.
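
      One way to picture this suggestion is the sketch below, which contrasts a setter that writes both keys with one that writes only the memstore replication key. It is an illustration under assumptions, not the committed patch: the class and method names are hypothetical, the table descriptor is reduced to a map, the literal used for REGION_MEMSTORE_REPLICATION_KEY is assumed, and the issue only confirms the current behaviour for the true case.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the setter before and after the suggested change. */
class MemStoreReplicationSetterSketch {
  // Assumed literal for REGION_MEMSTORE_REPLICATION_KEY; illustrative only.
  static final String MEMSTORE_KEY = "REGION_MEMSTORE_REPLICATION";
  static final String WAIT_KEY = "hbase.region.replica.wait.for.primary.flush";

  /** Behaviour described in the issue: the wait flag is written as a side effect. */
  static Map<String, String> currentSetter(boolean memstoreReplication) {
    Map<String, String> values = new LinkedHashMap<>();
    values.put(MEMSTORE_KEY, Boolean.toString(memstoreReplication));
    // Assumption: the same boolean is copied to the wait flag.
    values.put(WAIT_KEY, Boolean.toString(memstoreReplication));
    return values;
  }

  /** Suggested behaviour: only the memstore replication key is written. */
  static Map<String, String> proposedSetter(boolean memstoreReplication) {
    Map<String, String> values = new LinkedHashMap<>();
    values.put(MEMSTORE_KEY, Boolean.toString(memstoreReplication));
    return values;
  }

  public static void main(String[] args) {
    System.out.println(currentSetter(true).containsKey(WAIT_KEY));  // prints true
    System.out.println(proposedSetter(true).containsKey(WAIT_KEY)); // prints false
  }
}
```

      With the second variant the region-level view of the wait flag falls back to the server-level value, so it cannot disagree with whether the flush executor was created.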

      This problem may be more serious on master, because the new replication framework (HBASE-26233) does not enable reads on the secondary replica when it receives the flush marker; that is, the secondary replica would be disabled for reads forever.
      On branch-2, by contrast, the secondary replica is re-enabled for reads when it receives the flush marker.
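
      The difference between the two branches can be summarised in a toy state model. Everything here is hypothetical (FlushMarkerModel, Branch, SecondaryReplica); it encodes only the two statements above and says nothing about how either branch implements flush marker handling.

```java
/** Hypothetical toy model of what a flush marker does on each branch. */
class FlushMarkerModel {
  enum Branch { BRANCH_2, MASTER }

  static final class SecondaryReplica {
    private final Branch branch;
    // State left behind by the faulty open path: reads are disabled.
    private boolean readsEnabled = false;

    SecondaryReplica(Branch branch) {
      this.branch = branch;
    }

    /** Models the arrival of a flush marker from the primary. */
    void onFlushMarker() {
      if (branch == Branch.BRANCH_2) {
        readsEnabled = true; // branch-2: the marker re-enables reads
      }
      // MASTER: per the issue, the marker does not re-enable reads.
    }

    boolean isReadsEnabled() {
      return readsEnabled;
    }
  }

  public static void main(String[] args) {
    SecondaryReplica onBranch2 = new SecondaryReplica(Branch.BRANCH_2);
    SecondaryReplica onMaster = new SecondaryReplica(Branch.MASTER);
    onBranch2.onFlushMarker();
    onMaster.onFlushMarker();
    System.out.println(onBranch2.isReadsEnabled()); // prints true
    System.out.println(onMaster.isReadsEnabled());  // prints false
  }
}
```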


              Assignee: comnetwork chenglei
              Reporter: comnetwork chenglei