Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha-1, 2.3.0, 2.3.1, 2.3.3
    • Fix Version/s: 2.5.0
    • Component/s: None
    • Labels: None

    Description

      This issue was discussed in PR#2113 as part of HBASE-24286, and it is a dependency that needs to be resolved before we can solve HBASE-24286.

      The behavior was introduced in HBASE-24471, which added the concept of a partial meta; meta is considered `partial` when InitMetaProcedure did not succeed and INIT_META_ASSIGN_META was not completed.

        private static void writeFsLayout(Path rootDir, Configuration conf) throws IOException {
          LOG.info("BOOTSTRAP: creating hbase:meta region");
          FileSystem fs = rootDir.getFileSystem(conf);
          Path tableDir = CommonFSUtils.getTableDir(rootDir, TableName.META_TABLE_NAME);
          // Any existing meta table directory is treated as partial and deleted.
          if (fs.exists(tableDir) && !fs.delete(tableDir, true)) {
            LOG.warn("Can not delete partial created meta table, continue...");
          }
          // ... (remainder of the method is omitted in this excerpt)
        }

      However, in the cloud use case where HFiles are stored on S3, WALs are stored on HDFS, and ZK data is stored within the cluster, this definition of partial meta becomes a blocker when a cluster is recreated on existing HFiles. In that case the ZK data and WALs cannot be retained (HDFS is tied to the cloud instances and is terminated together with them), so when the cluster is recreated on the flushed HFiles, the existing meta is always considered partial and is deleted in `INIT_META_WRITE_FS_LAYOUT` during bootstrap. As a result, the recreated cluster starts with an empty meta table: either the cluster hangs during master initialization (branch-2) because the table states of the namespace table cannot be assigned, or it starts as a fresh cluster with no regions assigned and no tables opened (HBCK may be needed to rebuild the meta).

      Potential solution suggested by Anoop

      During HM start and bootstrap, we create the ClusterID and write it to the FS, then to ZK, and then create the META table FS layout. So in a cluster recreate, we will see that the clusterID is there in the FS, along with the META FS layout, but there is no clusterID in ZK. It seems we can use this as the indication of a cluster recreate over existing data. In HM start, this is something we need to check first and track. If this mode is true, then later when (if) we do INIT_META_WRITE_FS_LAYOUT, we should not delete the META dir. As part of the bootstrap, when we write that proc to the MasterProcWal, we can include this mode (a boolean) as well; it is a protobuf message anyway. So even if this HM gets killed and restarted (at a point where the clusterId was written to ZK but the META FS layout step was not reached), we can use the info added to the bootstrap WAL entry and make sure NOT to delete the META dir.
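
      The detection rule and the persisted flag described above can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not HBase code: `ClusterRecreateSketch`, `isRecreateOverExistingData`, `PersistedBootstrapState`, and `mayDeleteMetaDir` are hypothetical names, and the three booleans stand in for real checks against the file system and ZooKeeper.

```java
// Hypothetical sketch: none of these names exist in HBase.
final class ClusterRecreateSketch {

  /** True when the FS holds a cluster ID and a meta layout but ZK has no cluster ID. */
  static boolean isRecreateOverExistingData(
      boolean clusterIdInFs, boolean metaDirInFs, boolean clusterIdInZk) {
    return clusterIdInFs && metaDirInFs && !clusterIdInZk;
  }

  /** Stand-in for the boolean stored with the bootstrap procedure's WAL entry. */
  static final class PersistedBootstrapState {
    final boolean recreateMode;

    PersistedBootstrapState(boolean recreateMode) {
      this.recreateMode = recreateMode;
    }
  }

  /**
   * Decides whether the write-FS-layout step may delete an existing meta dir.
   * A persisted flag (if any) wins over what is detected now, because after a
   * master restart ZK may already contain the cluster ID again.
   */
  static boolean mayDeleteMetaDir(PersistedBootstrapState persisted, boolean detectedNow) {
    boolean recreate = (persisted != null) ? persisted.recreateMode : detectedNow;
    return !recreate;
  }

  public static void main(String[] args) {
    // First start after a recreate: the FS has data, ZK is empty.
    boolean detected = isRecreateOverExistingData(true, true, false);
    PersistedBootstrapState saved = new PersistedBootstrapState(detected);
    System.out.println("first start, may delete: " + mayDeleteMetaDir(null, detected));

    // Master killed after writing the cluster ID to ZK, then restarted.
    boolean detectedAfterRestart = isRecreateOverExistingData(true, true, true);
    System.out.println("after restart, may delete: "
        + mayDeleteMetaDir(saved, detectedAfterRestart));
  }
}
```

      The second case in `main` is the reason for persisting the flag: once the cluster ID is back in ZK, a fresh check alone can no longer tell a recreate apart from an ordinary restart.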

      In this JIRA, we are going to fix the definition of `partial` for the case where the cluster ID is found on the file system alongside the HFiles but the ZK data has been deleted or is fresh on cluster creation.
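
      As a rough illustration of what a narrower `partial` definition means for the delete step, the sketch below guards the delete with a flag. It is an assumption-laden example, not the actual patch: it uses `java.nio.file` on the local file system instead of the Hadoop `FileSystem` API, and `GuardedMetaLayoutSketch`, `prepareMetaDir`, and `recreateMode` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch on the local file system; not the HBase implementation.
final class GuardedMetaLayoutSketch {

  /**
   * Mirrors the delete-if-exists step, but skips the delete in recreate mode.
   * Returns true if an existing directory was kept, false otherwise.
   */
  static boolean prepareMetaDir(Path metaDir, boolean recreateMode) throws IOException {
    if (!Files.exists(metaDir)) {
      return false; // nothing on disk, so a fresh layout can be written
    }
    if (recreateMode) {
      return true; // existing meta belongs to the recreated cluster: keep it
    }
    // Treat the directory as partial: delete children before parents.
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(metaDir)) {
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
    return false;
  }

  public static void main(String[] args) throws IOException {
    Path metaDir = Files.createTempDirectory("meta-sketch");
    Files.createFile(metaDir.resolve("hfile-placeholder"));

    System.out.println("kept in recreate mode: " + prepareMetaDir(metaDir, true));
    System.out.println("kept otherwise: " + prepareMetaDir(metaDir, false));
    System.out.println("dir still exists: " + Files.exists(metaDir));
  }
}
```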

            People

              Assignee: taklwu Tak-Lon (Stephen) Wu
              Reporter: taklwu Tak-Lon (Stephen) Wu
              Votes: 0
              Watchers: 7
