HBase / HBASE-27476

Recovered replication may be blocked if hbase.separate.oldlogdir.by.regionserver is enabled

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0-alpha-3, 2.4.15
    • Fix Version/s: None
    • Component/s: Replication
    • Labels: None

    Description

      While working on another PR, I hit a failing unit test:

      [ERROR] Failures: 
      [ERROR] org.apache.hadoop.hbase.replication.TestReplicationKillMasterRSWithSeparateOldWALs.killOneMasterRS
      [ERROR]   Run 1: TestReplicationKillMasterRSWithSeparateOldWALs>TestReplicationKillMasterRS.killOneMasterRS:47->TestReplicationKillRS.loadTableAndKillRS:84 Waited too much time for queueFailover replication. Waited 61065ms.
      [ERROR]   Run 2: TestReplicationKillMasterRSWithSeparateOldWALs>TestReplicationKillMasterRS.killOneMasterRS:47->TestReplicationKillRS.loadTableAndKillRS:84 Waited too much time for queueFailover replication. Waited 58864ms.
      [ERROR]   Run 3: TestReplicationKillMasterRSWithSeparateOldWALs>TestReplicationKillMasterRS.killOneMasterRS:47->TestReplicationKillRS.loadTableAndKillRS:84 Waited too much time for queueFailover replication. Waited 57103ms. 

      This appears to be caused by a bug rather than by test flakiness.

      If hbase.separate.oldlogdir.by.regionserver is enabled, old WALs are moved into a separate directory per regionserver name, for example root/oldWALs/server1/wal1. Recovered replication cannot convert a WAL path such as root/oldWALs/wal1 into the per-server path, so it throws a FileNotFoundException.
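
      To illustrate the path mismatch, here is a minimal sketch of the two layouts. The class ArchivedWalPaths and its candidates method are hypothetical names for illustration only; this is not HBase's actual code or API, and the directory and file names are the examples from the description above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ArchivedWalPaths {

  // Hypothetical helper (an assumption, not part of HBase): lists the
  // locations where an archived WAL could live under the old-WAL directory.
  public static List<Path> candidates(Path oldWalDir, String serverName, String walName) {
    List<Path> result = new ArrayList<>();
    // Flat layout: the option is disabled, the WAL sits directly in oldWALs.
    result.add(oldWalDir.resolve(walName));
    // Per-server layout: the option is enabled, the WAL sits in a
    // subdirectory named after the regionserver.
    result.add(oldWalDir.resolve(serverName).resolve(walName));
    return result;
  }

  public static void main(String[] args) {
    Path oldWalDir = Paths.get("root", "oldWALs");
    // Print both candidate locations for the example WAL.
    for (Path p : candidates(oldWalDir, "server1", "wal1")) {
      System.out.println(p);
    }
  }
}
```

      A reader that only tries the first candidate will miss files stored in the per-server layout, which matches the FileNotFoundException described above.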

          People

            Assignee: Ddupg Sun Xin
            Reporter: Ddupg Sun Xin