HBASE-18036

HBase 1.x : Data locality is not maintained after cluster restart or SSH


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.4.0, 1.3.1, 1.2.5, 1.1.10
    • Fix Version/s: 1.4.0, 1.3.2, 1.1.11, 1.2.7
    • Component/s: Region Assignment
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      After HBASE-2896 / HBASE-4402, we believed that data locality is maintained after a cluster restart. However, we have seen some complaints about loss of data locality after a cluster restart (e.g. HBASE-17963).

      Examining the AssignmentManager#processDeadServersAndRegionsInTransition() code, for cluster start, I expected to hit the following code path:

          if (!failover) {
            // Fresh cluster startup.
            LOG.info("Clean cluster startup. Assigning user regions");
            assignAllUserRegions(allRegions);
          }
      

      where assignAllUserRegions() would call retainAssignment() in the LoadBalancer. However, the master log shows that we usually hit the failover code path instead:

          // If we found user regions out on cluster, its a failover.
          if (failover) {
            LOG.info("Found regions out on cluster or in RIT; presuming failover");
            // Process list of dead servers and regions in RIT.
            // See HBASE-4580 for more information.
            processDeadServersAndRecoverLostRegions(deadServers);
          }
      

      where processDeadServersAndRecoverLostRegions() hands the dead servers to SSH (ServerShutdownHandler), and SSH uses roundRobinAssignment() in the LoadBalancer. That is why a cluster restart loses locality more often than it retains it.

      Note: the code I was looking at is close to branch-1 and branch-1.1.
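
      To illustrate why the choice of code path matters for locality, here is a minimal, self-contained sketch. It is not the HBase implementation: the class LocalitySketch, the helper retainedCount(), and the region and server names are hypothetical, and servers are modelled as plain strings. The two methods only borrow the names retainAssignment and roundRobinAssignment to show the general idea of each strategy.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, not HBase code: regions and servers are plain strings.
public class LocalitySketch {

    // Retain-style plan: keep each region on its previous server if that server
    // is live again; otherwise fall back to round-robin over the live servers.
    public static Map<String, String> retainAssignment(
            Map<String, String> previous, List<String> liveServers) {
        Map<String, String> plan = new LinkedHashMap<>();
        int next = 0;
        for (Map.Entry<String, String> e : previous.entrySet()) {
            String oldServer = e.getValue();
            if (liveServers.contains(oldServer)) {
                plan.put(e.getKey(), oldServer);
            } else {
                plan.put(e.getKey(), liveServers.get(next++ % liveServers.size()));
            }
        }
        return plan;
    }

    // Round-robin plan: spread regions evenly, ignoring where they used to live.
    public static Map<String, String> roundRobinAssignment(
            Iterable<String> regions, List<String> liveServers) {
        Map<String, String> plan = new LinkedHashMap<>();
        int next = 0;
        for (String region : regions) {
            plan.put(region, liveServers.get(next++ % liveServers.size()));
        }
        return plan;
    }

    // Number of regions that end up on the server they were on before the restart.
    public static int retainedCount(Map<String, String> previous, Map<String, String> plan) {
        int count = 0;
        for (Map.Entry<String, String> e : previous.entrySet()) {
            if (e.getValue().equals(plan.get(e.getKey()))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Region -> server mapping before the restart (made-up data).
        Map<String, String> previous = new LinkedHashMap<>();
        previous.put("r1", "rs1");
        previous.put("r2", "rs1");
        previous.put("r3", "rs2");
        previous.put("r4", "rs2");
        previous.put("r5", "rs3");
        previous.put("r6", "rs3");
        List<String> live = Arrays.asList("rs1", "rs2", "rs3");

        int retained = retainedCount(previous, retainAssignment(previous, live));
        int roundRobin = retainedCount(previous, roundRobinAssignment(previous.keySet(), live));
        System.out.println("retain: " + retained + "/" + previous.size());          // retain: 6/6
        System.out.println("round-robin: " + roundRobin + "/" + previous.size());   // round-robin: 2/6
    }
}
```

      In this toy data set every server comes back after the restart, so the retain-style plan keeps all six regions in place, while the round-robin plan leaves only two of six on their previous server. The balance across servers is the same in both plans; only locality differs.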

      Attachments

        1. HBASE-18036.v2-branch-1.1.patch
          9 kB
          Stephen Yuan Jiang
        2. HBASE-18036.v1-branch-1.1.patch
          9 kB
          Stephen Yuan Jiang
        3. HBASE-18036.v0-branch-1.patch
          3 kB
          Stephen Yuan Jiang
        4. HBASE-18036.v0-branch-1.1.patch
          9 kB
          Stephen Yuan Jiang


          People

            Assignee: Stephen Yuan Jiang (syuanjiang)
            Reporter: Stephen Yuan Jiang (syuanjiang)
            Votes: 4
            Watchers: 11
