Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 0.7 beta 2
    • Component/s: Core
    • Labels:
      None

      Description

      Now that CASSANDRA-1035 has given Cassandra a request scheduler interface, it would be nice to have a weighted request scheduler.

        Activity

        Jeremy Hanna added a comment -

        Will start with UserWeightedScheduler with some configuration options based on user.

        Jeremy Hanna added a comment - edited

        Taking a look at using Deficit Round Robin scheduling (http://en.wikipedia.org/wiki/Deficit_round_robin), as its per-request scheduling cost is O(1) regardless of the number of flows that are managed.

        Jeremy Hanna added a comment -

        We'll probably just stick with a simple weighted round robin for an initial weighted scheduler. It will assume that all requests are equal and give weights to users' requests based on configuration settings.

        As more is learned, we can implement more sophisticated schedulers that take into account call type (read vs write, types of reads) when estimating how long an operation will take. We could just define a cost for each type of call and do a deficit calculation based on that for each queue.
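
The deficit calculation mentioned above could look roughly like the following. This is a minimal sketch of the possible future direction, not the scheduler that was committed for this issue. The class name, the `CallType` enum, its cost values, and the per-user quantum are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Sketch only: deficit round robin over per-user queues, with assumed per-call-type costs. */
class DeficitSchedulerSketch {
    /** Hypothetical call types; the costs are illustrative placeholders, not measurements. */
    enum CallType {
        WRITE(1), READ(2), RANGE_READ(4);

        final int cost;

        CallType(int cost) { this.cost = cost; }
    }

    /** One flow per user: pending calls, a per-round quantum (the weight), and unused credit. */
    private static final class Flow {
        final Queue<CallType> pending = new ArrayDeque<>();
        final int quantum;
        int deficit = 0;

        Flow(int quantum) { this.quantum = quantum; }
    }

    private final Map<String, Flow> flows = new LinkedHashMap<>();
    private final int defaultQuantum;

    DeficitSchedulerSketch(int defaultQuantum) { this.defaultQuantum = defaultQuantum; }

    /** Gives a user a non-default quantum; call before queueing anything for that user. */
    void setQuantum(String user, int quantum) { flows.put(user, new Flow(quantum)); }

    void queue(String user, CallType type) {
        flows.computeIfAbsent(user, u -> new Flow(defaultQuantum)).pending.add(type);
    }

    /** Visits every flow once and returns the calls released this round as "user:TYPE". */
    List<String> scheduleRound() {
        List<String> released = new ArrayList<>();
        for (Map.Entry<String, Flow> e : flows.entrySet()) {
            Flow f = e.getValue();
            if (f.pending.isEmpty()) continue;      // idle flows earn no credit
            f.deficit += f.quantum;                 // add this round's credit
            while (!f.pending.isEmpty() && f.pending.peek().cost <= f.deficit) {
                CallType next = f.pending.poll();
                f.deficit -= next.cost;             // pay for the released call
                released.add(e.getKey() + ":" + next);
            }
            if (f.pending.isEmpty()) f.deficit = 0; // drained flows do not bank credit
        }
        return released;
    }
}
```

With a quantum of 4 for one user and 2 for another, the first user can release four cost-1 writes per round, while the second needs two rounds of credit before a cost-4 range read goes through. Note that this simple version walks every flow each round; it does not attempt the active-list bookkeeping a production deficit round robin would use.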

        Jeremy Hanna added a comment -

        Added simple keyspace-based weighting.

        Stu Hood added a comment -
        • Could we attach the weights to the queue to minimize the work done in the scheduling loop?
        • In the innermost loop, you should probably break the weighted loop if the thread popped is ever null
        • The changes in ClientState are not necessary for this patch

        Thanks Jeremy!

        Jeremy Hanna added a comment - edited

        > Could we attach the weights to the queue to minimize the work done in the scheduling loop?

        I agree that we want to minimize instructions in the scheduler. However, the getWeight method only does a null check and potentially a hashmap lookup. If we were to attach weights to the queue, say by giving the queue map a Pair<String, Integer> key instead of just a String key, that might cause a lot more objects (Pairs and Integers) to be created. Also, it would still have to call getWeight; it would just call it in the queue(...) method instead of in the schedule() loop. So getWeight would still be called the same number of times: once per request thread.

        > In the innermost loop, you should probably break the weighted loop if the thread popped is ever null

        True, will add that.

        > The changes in ClientState are not necessary for this patch

        They're not necessary for this patch, but they clean up that method and remove the need for SCHEDULE_ON_KEYSPACE by just doing a switch over the values in that enum. Changing the name to SchedulingValue just seemed more intuitive to me, as I had been confusing schedulerId and schedulingId. It helped me keep them straight more easily; maybe that's just me.

        Stu Hood added a comment -

        > that might cause a lot more objects (Pairs and Integers) to be created
        Queues are only created once at the moment, so it shouldn't cause significant overhead.

        Jeremy Hanna added a comment - edited

        I guess I just don't see a net gain. getWeight would need to be called for each request either in the schedule method or the queue method. Maybe I'm misunderstanding.

        Stu Hood added a comment -

        The value in the queue map could be the pair, instead of the key.

        Jeremy Hanna added a comment -

        Attached weights to the queues so it wouldn't have to resolve weights in the schedule method (even for empty queues).

        Added an override for the default weight so that people can use something other than 1 for their default.

        Added a check for null in the schedule() loop to break out if there are no more threads.
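
The three changes listed above can be sketched together as follows. This is a simplified, single-threaded illustration and not the committed patch: the class name and the `WeightedQueue` holder are hypothetical, and plain strings stand in for the request threads that the real scheduler queues. Only `getWeight`, `queue(...)`, and `schedule()` are names taken from the discussion.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Sketch only: weighted round robin where each queue carries its own weight. */
class WeightedRoundRobinSketch {
    /** The value in the queue map pairs a weight with its queue, so schedule() never resolves weights. */
    private static final class WeightedQueue {
        final int weight;
        final Queue<String> requests = new ArrayDeque<>();

        WeightedQueue(int weight) { this.weight = weight; }
    }

    private final Map<String, Integer> configuredWeights;
    private final int defaultWeight; // overridable default, used for ids with no configured weight
    private final Map<String, WeightedQueue> queues = new LinkedHashMap<>();

    WeightedRoundRobinSketch(Map<String, Integer> configuredWeights, int defaultWeight) {
        this.configuredWeights = configuredWeights;
        this.defaultWeight = defaultWeight;
    }

    /** A null check plus a map lookup; called once, when the queue for an id is created. */
    private int getWeight(String id) {
        Integer weight = configuredWeights.get(id);
        return weight == null ? defaultWeight : weight;
    }

    void queue(String id, String request) {
        queues.computeIfAbsent(id, k -> new WeightedQueue(getWeight(k))).requests.add(request);
    }

    /** One pass over all queues; each queue may release up to its weight in requests. */
    List<String> schedule() {
        List<String> released = new ArrayList<>();
        for (WeightedQueue q : queues.values()) {
            for (int i = 0; i < q.weight; i++) {
                String next = q.requests.poll();
                if (next == null) break; // queue is drained, stop spending its weight
                released.add(next);
            }
        }
        return released;
    }
}
```

Because the weight lives next to the queue, an empty queue costs one failed `poll()` per pass and no weight lookup at all.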

        Stu Hood added a comment -

        +1
        Thanks Jeremy.

        Jeremy Hanna added a comment -

        no response from people about whether this will get into beta2

        Jonathan Ellis added a comment -

        committed

        Hudson added a comment -

        Integrated in Cassandra #545 (See https://hudson.apache.org/hudson/job/Cassandra/545/)
        Add weighted request scheduler. patch by Jeremy Hanna; reviewed by Stu Hood for CASSANDRA-1485


          People

          • Assignee:
            Jeremy Hanna
            Reporter:
            Jeremy Hanna
            Reviewer:
            Stu Hood
          • Votes:
            0
            Watchers:
            1