Details
Type: Improvement
Status: Resolved
Priority: Low
Resolution: Won't Fix
Description
Currently in Cassandra, load balancing takes disk space into account. With an order-preserving partitioner, some token ranges can become hot spots in terms of operations. We would like to propose improving load balancing so that it also takes the number of operations into account.
There are two places where this can be handled:
1. When the cluster decides which nodes need to be balanced.
2. When balancing an individual node: deciding where to split.
For number 1, the number of operations that a node has performed could be factored into how important it is to balance that node.
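As a rough illustration of number 1, the sketch below ranks nodes by a score that blends disk usage with operation count. It is not Cassandra code: `NodeLoad`, `balance_priority`, and the `ops_weight` parameter are hypothetical names, and normalizing each metric against the cluster mean is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass
class NodeLoad:
    name: str
    disk_bytes: int  # space used on the node
    ops: int         # operations served over some window (hypothetical metric)


def balance_priority(nodes, ops_weight=0.5):
    """Return nodes ordered from most to least in need of balancing.

    ops_weight = 0.0 ranks by disk space only; 1.0 ranks by operations only.
    """
    mean_disk = sum(n.disk_bytes for n in nodes) / len(nodes)
    mean_ops = sum(n.ops for n in nodes) / len(nodes)

    def score(n):
        # Express each metric relative to the cluster mean so the two
        # dimensions are comparable before blending them.
        disk_ratio = n.disk_bytes / mean_disk if mean_disk else 0.0
        ops_ratio = n.ops / mean_ops if mean_ops else 0.0
        return (1 - ops_weight) * disk_ratio + ops_weight * ops_ratio

    return sorted(nodes, key=score, reverse=True)
```

With `ops_weight=0.0` this reduces to ranking by space alone; raising it moves nodes that serve a hot range towards the front even when their disk usage is average.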
For number 2, we already use a midpoint in the node when load balancing with respect to space. We propose adding a weight to the midpoint so that the split leans towards evening out the operational load, not just space.
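One possible reading of number 2 is sketched below: rather than splitting where the data size is halved, pick the boundary where a blend of size and operation count is closest to half. This is only an interpretation of the proposal. `weighted_split_index`, `ops_weight`, and the per-key `sizes` and `ops` samples are hypothetical and do not correspond to existing Cassandra APIs.

```python
def weighted_split_index(sizes, ops, ops_weight=0.5):
    """Pick a split boundary for keys listed in token order.

    sizes[i] and ops[i] describe the i-th key (or key sample). The result i
    means keys[:i] stay on one side and keys[i:] move to the other.
    ops_weight = 0.0 gives the plain space-based midpoint.
    """
    if len(sizes) != len(ops) or len(sizes) < 2:
        raise ValueError("need at least two keys with matching sizes and ops")

    total_size = sum(sizes)
    total_ops = sum(ops)
    # Normalize each dimension to a fraction of its total, then blend.
    weights = [
        (1 - ops_weight) * (s / total_size if total_size else 0.0)
        + ops_weight * (o / total_ops if total_ops else 0.0)
        for s, o in zip(sizes, ops)
    ]

    half = sum(weights) / 2
    best_index, best_gap = 1, float("inf")
    running = 0.0
    # Try every interior boundary and keep the one closest to half the weight.
    for i in range(1, len(weights)):
        running += weights[i - 1]
        gap = abs(running - half)
        if gap < best_gap:
            best_index, best_gap = i, gap
    return best_index
```

For four equally sized keys where only the last one receives traffic, `ops_weight=0.0` splits in the middle, while `ops_weight=0.5` moves the boundary so the hot key ends up alone on its side.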