HBASE-11236

Last flushed sequence id is ignored by ServerManager

    Details

    • Type: Bug
    • Status: Reopened
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels:
      None

      Description

      I got lots of error messages like this:

      2014-05-22 08:58:59,793 DEBUG [RpcServer.handler=1,port=20020] master.ServerManager: RegionServer a2428.halxg.cloudera.com,20020,1400742071109 indicates a last flushed sequence id (numberOfStores=9, numberOfStorefiles=2, storefileUncompressedSizeMB=517, storefileSizeMB=517, compressionRatio=1.0000, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=34, totalStaticIndexSizeKB=381, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN) that is less than the previous last flushed sequence id (605446) for region IntegrationTestBigLinkedList, �A��*t�^FU�2��0,1400740489477.a44d3e309b5a7e29355f6faa0d3a4095. Ignoring.

      RegionLoad.toString doesn't print out the last flushed sequence id passed in. Why is it less than the previous one?
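
To make the behaviour behind this message concrete, here is a minimal sketch of the kind of monotonic check that produces it. It is an illustration only, not the HBase source: the class `FlushedSequenceTracker`, its method names, the server name `rs1`, and the lower id 605000 are hypothetical. Only the previous id 605446 and the region hash come from the log above. Unlike that log line, the sketch prints the reported id itself.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical model of master-side tracking; NOT the HBase implementation. */
final class FlushedSequenceTracker {
    // Highest flushed sequence id accepted so far, keyed by encoded region name.
    private final Map<String, Long> lastFlushed = new HashMap<>();

    /**
     * Records a reported flushed sequence id for a region.
     * Returns true if the id was accepted, false if it was lower than the
     * stored one and therefore ignored.
     */
    synchronized boolean report(String server, String encodedRegion, long reportedSeqId) {
        Long previous = lastFlushed.get(encodedRegion);
        if (previous != null && reportedSeqId < previous) {
            // Print the numeric id that was passed in, so the gap is visible in the log.
            System.out.printf(
                "RegionServer %s indicates a last flushed sequence id (%d) that is less than "
                    + "the previous last flushed sequence id (%d) for region %s. Ignoring.%n",
                server, reportedSeqId, previous, encodedRegion);
            return false;
        }
        lastFlushed.put(encodedRegion, reportedSeqId);
        return true;
    }

    /** Returns the stored id, or -1 if nothing has been reported for the region. */
    synchronized long lastFlushedSequenceId(String encodedRegion) {
        return lastFlushed.getOrDefault(encodedRegion, -1L);
    }

    public static void main(String[] args) {
        FlushedSequenceTracker tracker = new FlushedSequenceTracker();
        String region = "a44d3e309b5a7e29355f6faa0d3a4095";
        tracker.report("rs1", region, 605446L); // accepted
        tracker.report("rs1", region, 605000L); // hypothetical lower id: logged and ignored
        System.out.println(tracker.lastFlushedSequenceId(region));
    }
}
```

In this model the stored value never decreases, so a lower report leaves 605446 in place. Whether the lower report or the stored value is the wrong one is exactly the open question in this issue.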

          Activity

          chia7712 Chia-Ping Tsai added a comment -

          HBASE-16721 may address this problem. According to the earlier comments, this issue has occurred in 1.1.2, 1.2, and hbase-1.2.0+cdh5.8.0.
          HBASE-16721 has been merged into 1.1.8+, 1.2.4+, and cdh5.9.2. We should close this JIRA if there are no more victims.

          franluo frank luo added a comment -

          I am on version 1.1.2, and a data loss bug landed me here.

          What happened is that data loss was identified on region cea9c145a49489e2ebbc683c9bc0b545. Grepping for the region ID, I found nothing but the lines below:

          hbase-hbase-master-hqhd02nm01.pclc0.merkle.local.log.20:2017-06-30 05:47:20,237 WARN [B.priority.fifo.QRpcServer.handler=2,queue=0,port=16000] master.ServerManager: RegionServer hqhd02dt031.pclc0.merkle.local,16020,1498275002632 indicates a last flushed sequence id (20717979) that is less than the previous last flushed sequence id (24918589) for region CR_CDI_CHHEQPROD_HASH_RECORD,f8886,1492710130131.cea9c145a49489e2ebbc683c9bc0b545. Ignoring.
          hbase-hbase-master-hqhd02nm01.pclc0.merkle.local.log.20:2017-06-30 05:47:23,255 WARN [B.priority.fifo.QRpcServer.handler=19,queue=1,port=16000] master.ServerManager: RegionServer hqhd02dt031.pclc0.merkle.local,16020,1498275002632 indicates a last flushed sequence id (20717979) that is less than the previous last flushed sequence id (24918589) for region CR_CDI_CHHEQPROD_HASH_RECORD,f8886,1492710130131.cea9c145a49489e2ebbc683c9bc0b545. Ignoring.
          hbase-hbase-master-hqhd02nm01.pclc0.merkle.local.log.20:2017-06-30 05:47:26,273 WARN [B.priority.fifo.QRpcServer.handler=0,queue=0,port=16000] master.ServerManager: RegionServer hqhd02dt031.pclc0.merkle.local,16020,1498275002632 indicates a last flushed sequence id (20717979) that is less than the previous last flushed sequence id (24918589) for region CR_CDI_CHHEQPROD_HASH_RECORD,f8886,1492710130131.cea9c145a49489e2ebbc683c9bc0b545. Ignoring.
          hbase-hbase-master-hqhd02nm01.pclc0.merkle.local.log.20:2017-06-30 05:47:29,290 WARN [B.priority.fifo.QRpcServer.handler=16,queue=0,port=16000] master.ServerManager: RegionServer hqhd02dt031.pclc0.merkle.local,16020,1498275002632 indicates a last flushed sequence id (20717979) that is less than the previous last flushed sequence id (24918589) for region CR_CDI_CHHEQPROD_HASH_RECORD,f8886,1492710130131.cea9c145a49489e2ebbc683c9bc0b545. Ignoring.
          h

          sbarrier Sebastien Barrier added a comment -

          We are having the same issue with hbase-1.2.0+cdh5.8.0+160-1:

          2016-09-22 02:34:21,805 WARN [B.defaultRpcServer.handler=43,queue=3,port=60000] master.ServerManager: RegionServer dr2,60020,1470414843847 indicates a last flushed sequence id (6178551) that is less than the previous last flushed sequence id (9091225) for region user_entry_tags,?,1472495644435.f5cbb00e42bcfd353e2041818b69fd6f. Ignoring.
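
For anyone triaging these warnings, a small sketch like the following can pull the reported id, the previous id, and the region name out of a master log line, so the size of the regression can be compared across regions. The class `FlushWarningParser` and its members are hypothetical helper names, and the pattern is only assumed to fit the WARN format quoted in this thread (with a numeric id in the first parentheses); it is not an HBase utility.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for reading the ServerManager warning quoted in this issue. */
final class FlushWarningParser {
    // Groups: 1 = reported id, 2 = previous id, 3 = region name.
    private static final Pattern WARNING = Pattern.compile(
        "indicates a last flushed sequence id \\((\\d+)\\) that is less than "
            + "the previous last flushed sequence id \\((\\d+)\\) for region (.+?)\\.? Ignoring\\.");

    /** Fields extracted from one warning line. */
    static final class Warning {
        final long reported;
        final long previous;
        final String region;

        Warning(long reported, long previous, String region) {
            this.reported = reported;
            this.previous = previous;
            this.region = region;
        }

        /** How far the reported id lags behind the id the master already holds. */
        long gap() {
            return previous - reported;
        }
    }

    /** Returns the parsed warning, or empty if the line is not such a warning. */
    static Optional<Warning> parse(String line) {
        Matcher m = WARNING.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Warning(
            Long.parseLong(m.group(1)), Long.parseLong(m.group(2)), m.group(3)));
    }

    public static void main(String[] args) {
        String line = "2016-09-22 02:34:21,805 WARN [B.defaultRpcServer.handler=43,queue=3,port=60000] "
            + "master.ServerManager: RegionServer dr2,60020,1470414843847 indicates a last flushed "
            + "sequence id (6178551) that is less than the previous last flushed sequence id (9091225) "
            + "for region user_entry_tags,?,1472495644435.f5cbb00e42bcfd353e2041818b69fd6f. Ignoring.";
        parse(line).ifPresent(w -> System.out.println(w.region + " gap=" + w.gap()));
    }
}
```

For the line above, the gap is 9091225 - 6178551 = 2912674 sequence ids.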

          allan163 Allan Yang added a comment -

          Ignoring the flushed sequence id can cause data loss; please refer to https://issues.apache.org/jira/browse/HBASE-16649?focusedCommentId=15502490&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15502490

          pankaj2461 Pankaj Kumar added a comment -

          Sorry Stephen Yuan Jiang, I couldn't find any useful information at that time. I haven't seen this log again since.

          syuanjiang Stephen Yuan Jiang added a comment -

          Pankaj Kumar, could you provide more information on this? I just wonder whether you suffered data loss from it.

          pankaj2461 Pankaj Kumar added a comment -

          Duo Zhang, it looks like the root cause is different; we have HBASE-13811 in our version. I am also trying to find the relevant information in the logs.

          Apache9 Duo Zhang added a comment -

          I think this could happen without HBASE-13811, where stack fixed an issue in which we may report a wrong last flushed sequence id if a flush is aborted.

          Elliott Clark, any more information? What happened to the region before this log (reassignment or flush)? Maybe there are other issues.

          Thanks.

          pankaj2461 Pankaj Kumar added a comment -

          I also observed the same log in a production environment, and regions were not opened.

          2015-08-31 19:04:25,448 | WARN  | PriorityRpcServer.handler=13,queue=1,port=21300 | RegionServer *.*.*.*,21302,1441017551749 indicates a last flushed sequence id (53128) that is less than the previous last flushed sequence id (53131) for region hbase:meta,,1 Ignoring. | org.apache.hadoop.hbase.master.ServerManager.updateLastFlushedSequenceIds(ServerManager.java:299)
          
          eclark Elliott Clark added a comment -

          I've seen this on 1.2 a decent amount.

          jxiang Jimmy Xiang added a comment -

          Don't see it any more. Close it.


            People

            • Assignee:
              Unassigned
              Reporter:
              jxiang Jimmy Xiang
            • Votes:
              1
              Watchers:
              14
