Kudu / KUDU-3147

Balance tablets based on range hash buckets


Details

    Description

      When a user defines a schema that uses range + hash partitioning, it is often the case that the tablets in the latest range (based on time or any other semi-sequential data) are the only tablets that receive writes. Even when the active range is not the latest one, it is common for a single range to receive a burst of writes when backloading data.

      This pattern is so common that the default Kudu balancing scheme should consider placing and rebalancing the tablets for the hash buckets within each range on as many servers as possible, in order to support the maximum write throughput. In that case, `min(#buckets, #total-cluster-tservers)` tservers would handle the writes when the cluster is perfectly balanced. Today, even in a perfectly balanced cluster, it is possible for all the hash buckets of a range to be on a single tserver.
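
To make the request concrete, here is a minimal Python sketch of a range-aware placement. It is not Kudu's rebalancer and does not use any Kudu API. The function names (`place_range_aware`, `writers_per_range`), the tserver names, and the simplification of one replica per tablet are all assumptions made for illustration only.

```python
from collections import defaultdict


def place_range_aware(ranges, num_buckets, tservers):
    """Hypothetical placement: spread the hash buckets of each range
    over as many distinct tservers as possible (one replica per tablet)."""
    placement = {}
    load = {ts: 0 for ts in tservers}  # total tablets per tserver
    for r in ranges:
        used = defaultdict(int)  # tablets of this range per tserver
        for b in range(num_buckets):
            # Prefer the tserver holding the fewest tablets of this range,
            # breaking ties by total tablet count, then by name.
            ts = min(tservers, key=lambda t: (used[t], load[t], t))
            placement[(r, b)] = ts
            used[ts] += 1
            load[ts] += 1
    return placement


def writers_per_range(placement):
    """Count the distinct tservers hosting the buckets of each range."""
    servers = defaultdict(set)
    for (r, _bucket), ts in placement.items():
        servers[r].add(ts)
    return {r: len(s) for r, s in servers.items()}
```

With 4 buckets per range and 6 tservers, `writers_per_range` reports 4 tservers for every range, which matches `min(#buckets, #total-cluster-tservers)`. By contrast, a placement that puts range `i` entirely on tserver `i` is also balanced by tablet count, yet leaves only one tserver to absorb the writes for the active range.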


          People

            raviBhanot Ravi Bhanot
            granthenke Grant Henke
            Votes: 2
            Watchers: 4