HBase / HBASE-14223

Meta WALs are not cleared if meta region was closed and RS aborts


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Reviewed

    Description

      When an RS opens meta and later closes it, the meta WAL (FSHLog) is not closed. The last WAL file just sits in the RS WAL directory. If the RS stops gracefully, the WAL file for meta is deleted. If the RS aborts instead, the WAL for meta is not cleaned up. It is also not split (which is correct), since the master determines that the RS no longer hosts meta at the time of the abort.

      On a cluster that had run ITBLL (IntegrationTestBigLinkedList) with CM (ChaosMonkey), I see a lot of -splitting directories left behind:

      [root@os-enis-dal-test-jun-4-7 cluster-os]# sudo -u hdfs hadoop fs -ls /apps/hbase/data/WALs
      Found 31 items
      drwxr-xr-x   - hbase hadoop          0 2015-06-05 01:14 /apps/hbase/data/WALs/hregion-58203265
      drwxr-xr-x   - hbase hadoop          0 2015-06-05 07:54 /apps/hbase/data/WALs/os-enis-dal-test-jun-4-1.openstacklocal,16020,1433489308745-splitting
      drwxr-xr-x   - hbase hadoop          0 2015-06-05 09:28 /apps/hbase/data/WALs/os-enis-dal-test-jun-4-1.openstacklocal,16020,1433494382959-splitting
      drwxr-xr-x   - hbase hadoop          0 2015-06-05 10:01 /apps/hbase/data/WALs/os-enis-dal-test-jun-4-1.openstacklocal,16020,1433498252205-splitting
      ...
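
      To spot such directories programmatically, the listing above can be filtered for directory entries whose path ends in -splitting. The following is only a minimal sketch: the class and method names are hypothetical, it assumes the `hadoop fs -ls` line layout shown above, and it assumes paths contain no spaces. A -splitting directory also exists legitimately while a split is in progress, so the result is a list of candidates, not confirmed leftovers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: picks -splitting directories out of `hadoop fs -ls` output.
final class SplittingDirs {
    static final String SPLITTING_SUFFIX = "-splitting";

    static List<String> candidateSplittingDirs(List<String> lsLines) {
        List<String> result = new ArrayList<>();
        for (String line : lsLines) {
            String trimmed = line.trim();
            // Directory entries start with 'd' in the permission column (e.g. drwxr-xr-x).
            if (!trimmed.startsWith("d")) {
                continue;
            }
            // Assumption: the path is the last whitespace-separated field and contains no spaces.
            String path = trimmed.substring(trimmed.lastIndexOf(' ') + 1);
            if (path.endsWith(SPLITTING_SUFFIX)) {
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "Found 31 items",
            "drwxr-xr-x   - hbase hadoop          0 2015-06-05 01:14 /apps/hbase/data/WALs/hregion-58203265",
            "drwxr-xr-x   - hbase hadoop          0 2015-06-05 07:54 /apps/hbase/data/WALs/os-enis-dal-test-jun-4-1.openstacklocal,16020,1433489308745-splitting");
        for (String dir : candidateSplittingDirs(sample)) {
            System.out.println(dir);
        }
    }
}
```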
      

      The directories contain WALs from meta:

      [root@os-enis-dal-test-jun-4-7 cluster-os]# sudo -u hdfs hadoop fs -ls /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting
      Found 2 items
      -rw-r--r--   3 hbase hadoop     201608 2015-06-05 03:15 /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433470511501.meta
      -rw-r--r--   3 hbase hadoop      44420 2015-06-05 04:36 /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433474111645.meta
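
      The file names in this listing carry two pieces of information: a URL-encoded server name (host, port and start code, with commas encoded as %2C) and, for meta WALs, a trailing .meta suffix. The sketch below shows one way to read both from a path. It is an illustration based only on the names visible in this report; the class, the regular expression and the method names are assumptions, not HBase APIs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: inspects WAL file names like those listed in this report.
final class WalFileNames {
    // Assumed layout after decoding: host,port,startcode followed by '.' and the rest of the name.
    private static final Pattern SERVER_PREFIX = Pattern.compile("^([^,]+,\\d+,\\d+)\\..*$");

    static String fileName(String path) {
        int slash = path.lastIndexOf('/');
        return slash < 0 ? path : path.substring(slash + 1);
    }

    // Meta WALs in the listing end in ".meta"; the default WALs in the master log do not.
    static boolean isMetaWal(String path) {
        return fileName(path).endsWith(".meta");
    }

    // Returns the decoded server name, or null if the name does not match the assumed layout.
    static String serverName(String path) {
        String decoded;
        try {
            decoded = URLDecoder.decode(fileName(path), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
        Matcher m = SERVER_PREFIX.matcher(decoded);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String wal = "os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433470511501.meta";
        System.out.println(serverName(wal) + " meta=" + isMetaWal(wal));
    }
}
```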
      

      The RS hosted the meta region for some time:

      2015-06-05 03:14:28,692 INFO  [PostOpenDeployTasks:1588230740] zookeeper.MetaTableLocator: Setting hbase:meta region location in ZooKeeper as os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285
      ...
      2015-06-05 03:15:17,302 INFO  [RS_CLOSE_META-os-enis-dal-test-jun-4-5:16020-0] regionserver.HRegion: Closed hbase:meta,,1.1588230740
      

      In between, a WAL is created:

      2015-06-05 03:15:11,707 INFO  [RS_OPEN_META-os-enis-dal-test-jun-4-5:16020-0-MetaLogRoller] wal.FSHLog: Rolled WAL /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433470511501.meta with entries=385, filesize=196.88 KB; new WAL /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433474111645.meta
      

      When CM later killed the region server, the master did not see these WAL files:

      ./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:46,075 INFO  [MASTER_SERVER_OPERATIONS-os-enis-dal-test-jun-4-3:16000-0] master.SplitLogManager: started splitting 2 logs in [hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting] for [os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285]
      ./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:47,300 INFO  [main-EventThread] wal.WALSplitter: Archived processed log hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285.default.1433475074436 to hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/oldWALs/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285.default.1433475074436
      ./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:50,497 INFO  [main-EventThread] wal.WALSplitter: Archived processed log hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285.default.1433475175329 to hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/oldWALs/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285.default.1433475175329
      ./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:50,507 WARN  [MASTER_SERVER_OPERATIONS-os-enis-dal-test-jun-4-3:16000-0] master.SplitLogManager: returning success without actually splitting and deleting all the log files in path hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting
      ./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:50,508 INFO  [MASTER_SERVER_OPERATIONS-os-enis-dal-test-jun-4-3:16000-0] master.SplitLogManager: finished splitting (more than or equal to) 129135000 bytes in 2 log files in [hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting] in 4433ms
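
      The master log above reports 2 logs split, while the directory listing earlier still shows 2 meta WALs. That is consistent with a split that selects only non-meta WALs when the dead server no longer hosts meta. The sketch below models that selection with a simple suffix filter. It is a hypothetical illustration of the behaviour described in this report, not the master's actual code, and the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not HBase code: which WALs a split picks up and which stay behind.
final class SplitSelection {
    static boolean isMeta(String walName) {
        return walName.endsWith(".meta");
    }

    // splitMeta=false models a server that no longer hosts meta when it dies.
    static List<String> selectForSplit(List<String> walNames, boolean splitMeta) {
        List<String> selected = new ArrayList<>();
        for (String name : walNames) {
            if (isMeta(name) == splitMeta) {
                selected.add(name);
            }
        }
        return selected;
    }

    // Files that remain in the -splitting directory after the selected ones are archived.
    static List<String> leftBehind(List<String> all, List<String> split) {
        List<String> rest = new ArrayList<>(all);
        rest.removeAll(split);
        return rest;
    }

    public static void main(String[] args) {
        String prefix = "os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285";
        List<String> dir = Arrays.asList(
            prefix + ".default.1433475074436",
            prefix + ".default.1433475175329",
            prefix + "..meta.1433470511501.meta",
            prefix + "..meta.1433474111645.meta");
        List<String> split = selectForSplit(dir, false);
        System.out.println("split: " + split.size() + ", left behind: " + leftBehind(dir, split).size());
    }
}
```

      Under this model nothing ever revisits the meta WALs of a server that gave up meta before aborting, which matches the -splitting directories that remain on the cluster.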
      

      Attachments

        1. HBASE-14223logs
          10 kB
          Samir Ahmic
        2. hbase-14223_v3-master.patch
          11 kB
          Enis Soztutar
        3. hbase-14223_v3-branch-1.patch
          24 kB
          Enis Soztutar
        4. hbase-14223_v3-branch-1.patch
          24 kB
          Enis Soztutar
        5. hbase-14223_v2-branch-1.patch
          26 kB
          Enis Soztutar
        6. hbase-14223_v1-branch-1.patch
          19 kB
          Enis Soztutar
        7. hbase-14223_v0.patch
          7 kB
          Enis Soztutar

            People

              Assignee: Unassigned
              Reporter: Enis Soztutar
              Votes: 1
              Watchers: 18
