CASSANDRA-1037

Improve load balancing to take into account load in terms of operations

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Currently, Cassandra's load balancing takes only disk space into account. When using an order-preserving partitioner, some token ranges can become hot spots in terms of operations. We would like to propose improving the load balancing so that it also takes the number of operations into account.

      There are two places where this can be handled:

      1. When the cluster decides which nodes need to be balanced.
      2. When balancing an individual node: deciding where to split its range.

      For number 1, the number of operations that a node has performed could be factored into how important it is to balance that node.
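      As a rough illustration of number 1, the sketch below ranks nodes by a blend of relative disk usage and relative operation rate. It is not Cassandra code: `NodeLoad`, `balance_priority`, and the `ops_weight` parameter are hypothetical names chosen for this example, and the linear blend is only one possible way to combine the two signals.

```python
from dataclasses import dataclass


@dataclass
class NodeLoad:
    """Hypothetical per-node load summary (not a Cassandra type)."""
    name: str
    disk_bytes: int
    ops_per_sec: float


def balance_priority(nodes, ops_weight=0.5):
    """Return nodes ordered from most to least in need of balancing.

    ops_weight = 0.0 ranks by disk space only;
    ops_weight = 1.0 ranks by operation rate only.
    """
    if not 0.0 <= ops_weight <= 1.0:
        raise ValueError("ops_weight must be between 0 and 1")
    if not nodes:
        return []

    mean_disk = sum(n.disk_bytes for n in nodes) / len(nodes)
    mean_ops = sum(n.ops_per_sec for n in nodes) / len(nodes)

    def score(node):
        # Express each signal relative to the cluster mean so that
        # bytes and operations are comparable before blending them.
        disk_ratio = node.disk_bytes / mean_disk if mean_disk else 0.0
        ops_ratio = node.ops_per_sec / mean_ops if mean_ops else 0.0
        return (1.0 - ops_weight) * disk_ratio + ops_weight * ops_ratio

    return sorted(nodes, key=score, reverse=True)


cluster = [
    NodeLoad("a", disk_bytes=100, ops_per_sec=10),
    NodeLoad("b", disk_bytes=50, ops_per_sec=100),
    NodeLoad("c", disk_bytes=50, ops_per_sec=10),
]
by_disk_only = balance_priority(cluster, ops_weight=0.0)
by_blend = balance_priority(cluster, ops_weight=0.5)
```

      With disk space alone, node "a" ranks first; once operations are weighted in, the busier node "b" moves ahead of it.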

      For number 2, we already use a midpoint in the node's range when load balancing with respect to space. We would propose adding a weight to that midpoint so the split leans toward evening out operational load, not just space.
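      One way to picture the weighted midpoint in number 2 is sketched below. It assumes the node can summarize its range as ordered buckets of `(end_token, bytes, ops)`; that bucket summary, the function name `weighted_split_token`, and the `ops_weight` parameter are all assumptions made for this example rather than anything Cassandra provides.

```python
def weighted_split_token(buckets, ops_weight=0.5):
    """Pick a split token that halves a blend of space and operations.

    buckets: list of (end_token, size_bytes, op_count), sorted by token.
    ops_weight = 0.0 reproduces a space-only midpoint;
    higher values pull the split toward the operational midpoint.
    """
    if not buckets:
        raise ValueError("at least one bucket is required")
    if not 0.0 <= ops_weight <= 1.0:
        raise ValueError("ops_weight must be between 0 and 1")

    total_bytes = sum(size for _, size, _ in buckets)
    total_ops = sum(ops for _, _, ops in buckets)

    cumulative = 0.0
    for end_token, size, ops in buckets:
        byte_share = size / total_bytes if total_bytes else 0.0
        op_share = ops / total_ops if total_ops else 0.0
        cumulative += (1.0 - ops_weight) * byte_share + ops_weight * op_share
        # Split at the first bucket boundary covering half the blended load.
        if cumulative >= 0.5:
            return end_token
    return buckets[-1][0]


# Most of the data sits before "h", but most operations hit the "h".."m" range.
sample = [("d", 100, 10), ("h", 200, 10), ("m", 100, 600), ("z", 100, 380)]
space_split = weighted_split_token(sample, ops_weight=0.0)
blended_split = weighted_split_token(sample, ops_weight=0.5)
```

      In this sample the space-only split falls at "h", while the blended split moves to "m" because the hot range carries most of the operations.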

        Activity

        Jeremy Hanna added a comment -

        We'll work to help get this in for 0.7, though it doesn't necessarily need to involve api changes. It is a significant change to behavior though, so we'll try to get this in.

        This is a significant item to help with multi-tenant clusters.

        Jonathan Ellis added a comment -

        i'm skeptical that you'd be able to balance fast enough to deal w/ hot spots significantly better than "balance by disk space", but i suppose it's worth a try

        Jonathan Ellis added a comment -

        node-at-a-time loadbalance feels like a dead end.


          People

          • Assignee: Unassigned
          • Reporter: Jeremy Hanna
          • Votes: 1
          • Watchers: 4

            Dates

            • Created:
              Updated:
              Resolved:
