Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-23589

FlushDescriptor contains non-matching family/output combinations

    XMLWordPrintableJSON

Details

    Description

      Flushing the active region creates the following files:

      2019-12-13 08:00:20,866 INFO org.apache.hadoop.hbase.regionserver.HStore: Added hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/20af2eb8929408f26d0b3b81e6b86d47/f2/dab4d1cc01e44773bad7bdb5d2e33b6c, entries=49128, sequenceid
      =70688, filesize=41.4 M
      2019-12-13 08:00:20,897 INFO org.apache.hadoop.hbase.regionserver.HStore: Added hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/20af2eb8929408f26d0b3b81e6b86d47/f3/ecc50f33085042f7bd2397253b896a3a, entries=50000, sequenceid
      =70688, filesize=42.3 M
      

      On the read replica region when we try to replay the flush we see the following:

      2019-12-13 08:00:21,279 WARN org.apache.hadoop.hbase.regionserver.HRegion: bfa9cdb0ab13d60b389df6621ab316d1 : At least one of the store files in flush: action: COMMIT_FLUSH table_name: "IntegrationTestRegionReplicaReplication" encoded_region_name: "20af2eb8929408f26d0b3b81e6b86d47" flush_sequence_number: 70688 store_flushes { family_name: "f2" store_home_dir: "f2" flush_output: "ecc50f33085042f7bd2397253b896a3a" } store_flushes { family_name: "f3" store_home_dir: "f3" flush_output: "dab4d1cc01e44773bad7bdb5d2e33b6c" } region_name: "IntegrationTestRegionReplicaReplication,cccccccc,1576252065847.20af2eb8929408f26d0b3b81e6b86d47." doesn't exist any more. Skip loading the file(s)
      java.io.FileNotFoundException: HFileLink locations=[hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/20af2eb8929408f26d0b3b81e6b86d47/f2/ecc50f33085042f7bd2397253b896a3a, hdfs://replica-1:8020/hbase/.tmp/data/default/IntegrationTestRegionReplicaReplication/20af2eb8929408f26d0b3b81e6b86d47/f2/ecc50f33085042f7bd2397253b896a3a, hdfs://replica-1:8020/hbase/mobdir/data/default/IntegrationTestRegionReplicaReplication/20af2eb8929408f26d0b3b81e6b86d47/f2/ecc50f33085042f7bd2397253b896a3a, hdfs://replica-1:8020/hbase/archive/data/default/IntegrationTestRegionReplicaReplication/20af2eb8929408f26d0b3b81e6b86d47/f2/ecc50f33085042f7bd2397253b896a3a]
              at org.apache.hadoop.hbase.io.FileLink.getFileStatus(FileLink.java:415)
              at org.apache.hadoop.hbase.util.ServerRegionReplicaUtil.getStoreFileInfo(ServerRegionReplicaUtil.java:135)
              at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.getStoreFileInfo(HRegionFileSystem.java:311)
              at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.replayFlush(HStore.java:2414)
              at org.apache.hadoop.hbase.regionserver.HRegion.replayFlushInStores(HRegion.java:5310)
              at org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushCommitMarker(HRegion.java:5184)
              at org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushMarker(HRegion.java:5018)
              at org.apache.hadoop.hbase.regionserver.RSRpcServices.doReplayBatchOp(RSRpcServices.java:1143)
              at org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:2229)
              at org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:29754)
              at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
              at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
              at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
              at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
      

      As we can see the flush_outputs got mixed up.

       

      The issue is caused by HRegion.internalFlushCacheAndCommit. The code assumes "stores.values() and storeFlushCtxs have same order" which no longer seems to be true.

      Attachments

        Issue Links

          Activity

            People

              bszabolcs Szabolcs Bukros
              bszabolcs Szabolcs Bukros
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: