Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.6.0
    • Fix Version/s: None
    • Component/s: container
    • Labels:

      Description

      We used to have Throttling in Samza, but we removed it in favor of CGroups and MessageChooser (SAMZA-2). There is still a demand for this feature, though.

      Looking for thoughts and feedback.

      1. DESIGN-SAMZA-24-0.pdf
        91 kB
        Chris Riccomini
      2. DESIGN-SAMZA-24-0.md
        5 kB
        Chris Riccomini

        Issue Links

          Activity

          Chris Riccomini added a comment -

          Migrating wiki-based SEP to .md/.pdf-based design doc as part of SAMZA-404.

          Sriram Subramanian added a comment -

          Working on this.

          Sriram Subramanian added a comment -

          I will update the JIRA with basic interface/implementation information.

          Sriram Subramanian added a comment -

          I missed talking about the need to use Throttling with other MessageChoosers. This makes things very fuzzy. What does round robin scheduling with a throttling value per stream partition even mean? The expectation of the app would be that RoundRobin behaves as "RoundRobin". With throttling, the sequence of events may no longer follow a round robin sequence. For this reason, I suggest we keep the Throttling-based picker as one more way of picking and not let the usages be mixed. Is there a use case that we can think of where I would need Priority/RoundRobin based scheduling with throttling? It just makes things harder to reason about.

          Chris Riccomini added a comment -

          One other thought: another argument for implementing throttling in the MessageChooser (whether it's metrics-based or not) is that we have the option to do partition-level throttling without blocking the whole container. We can opt to not pick a message from a given partition, rather than doing Thread.sleep. This has the effect of slowing down/throttling a single partition in the container without blocking processing in all other partitions in the container.
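
          As a rough illustration of skipping a throttled partition instead of sleeping, here is a minimal sketch. It assumes a simple minimum-interval throttle per partition and a caller-supplied timestamp; the class and method names (PartitionThrottleSketch, register, add, choose) are hypothetical and are not Samza APIs.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from Samza.
class PartitionThrottleSketch {
    private final Map<String, Deque<String>> queues = new LinkedHashMap<>();
    private final Map<String, Long> minIntervalMs = new LinkedHashMap<>();
    private final Map<String, Long> nextAllowedMs = new LinkedHashMap<>();

    // Assumed throttle model: at most one message per intervalMs for this partition.
    void register(String partition, long intervalMs) {
        queues.put(partition, new ArrayDeque<>());
        minIntervalMs.put(partition, intervalMs);
        nextAllowedMs.put(partition, 0L);
    }

    void add(String partition, String message) {
        queues.get(partition).addLast(message);
    }

    // Returns a message from the first partition that has data and is not
    // throttled at nowMs, or null if every partition is empty or throttled.
    // A throttled partition is skipped; the caller is never put to sleep.
    String choose(long nowMs) {
        for (Map.Entry<String, Deque<String>> entry : queues.entrySet()) {
            String partition = entry.getKey();
            Deque<String> queue = entry.getValue();
            if (!queue.isEmpty() && nowMs >= nextAllowedMs.get(partition)) {
                nextAllowedMs.put(partition, nowMs + minIntervalMs.get(partition));
                return queue.pollFirst();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        PartitionThrottleSketch chooser = new PartitionThrottleSketch();
        chooser.register("p0", 100L); // throttled partition
        chooser.register("p1", 0L);   // unthrottled partition
        chooser.add("p0", "a0");
        chooser.add("p0", "a1");
        chooser.add("p1", "b0");
        chooser.add("p1", "b1");
        for (long t = 0; t < 4; t++) {
            System.out.println("t=" + t + " -> " + chooser.choose(t));
        }
    }
}
```

          In this sketch, p0 is held back after its first message while p1 keeps draining, so the rest of the container is not blocked by the throttled partition.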

          Chris Riccomini added a comment -

          I agree that we should hide the full functionality of the throttler until we understand its implications better, if we go with the metrics-based throttling approach. This is the same conservative approach we took with the MessageChooser in SAMZA-2.

          Regarding the ThrottleMessageChooser, can you propose a straw man of what the class/interface composition would look like? Having a ThrottleMessageChooser means that we need some way to compose it with the other MessageChoosers, since you might want to use throttling with a RoundRobinChooser, a PriorityChooser, or whatever chooser you end up using.
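
          One possible shape for such a composition, offered only as a straw man, is a wrapper that holds messages back from an underlying chooser until their partition is eligible. The Chooser interface below is a simplified stand-in with just update/choose, not the actual Samza MessageChooser, and Msg, FifoChooser, ThrottlingChooser, and the minimum-interval throttle are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.LongSupplier;

// Hypothetical sketch: simplified stand-ins, not the Samza API.
class ThrottleCompositionSketch {
    static final class Msg {
        final String partition;
        final String body;
        Msg(String partition, String body) { this.partition = partition; this.body = body; }
    }

    interface Chooser {
        void update(Msg message); // offer a message to the chooser
        Msg choose();             // pick the next message, or null if none
    }

    // Stands in for any ordering policy (round robin, priority, ...).
    static final class FifoChooser implements Chooser {
        private final Deque<Msg> queue = new ArrayDeque<>();
        public void update(Msg message) { queue.addLast(message); }
        public Msg choose() { return queue.pollFirst(); }
    }

    // Wraps another chooser and only forwards a message once its partition
    // is past its throttle deadline. At most one message per partition is
    // handed to the delegate at a time, so the delegate cannot bypass the limit.
    static final class ThrottlingChooser implements Chooser {
        private final Chooser delegate;
        private final LongSupplier clockMs;
        private final Map<String, Long> minIntervalMs;
        private final Map<String, Long> nextAllowedMs = new HashMap<>();
        private final Set<String> inFlight = new HashSet<>();
        private final List<Msg> held = new ArrayList<>();

        ThrottlingChooser(Chooser delegate, LongSupplier clockMs, Map<String, Long> minIntervalMs) {
            this.delegate = delegate;
            this.clockMs = clockMs;
            this.minIntervalMs = minIntervalMs;
        }

        public void update(Msg message) { held.add(message); }

        public Msg choose() {
            long now = clockMs.getAsLong();
            Iterator<Msg> it = held.iterator();
            while (it.hasNext()) {
                Msg m = it.next();
                boolean eligible = now >= nextAllowedMs.getOrDefault(m.partition, 0L);
                if (eligible && inFlight.add(m.partition)) {
                    delegate.update(m);
                    it.remove();
                }
            }
            Msg chosen = delegate.choose();
            if (chosen != null) {
                inFlight.remove(chosen.partition);
                nextAllowedMs.put(chosen.partition,
                        now + minIntervalMs.getOrDefault(chosen.partition, 0L));
            }
            return chosen;
        }
    }

    public static void main(String[] args) {
        long[] now = {0L};
        Map<String, Long> limits = new HashMap<>();
        limits.put("p0", 100L); // p0: at most one message per 100 ms; p1 unthrottled
        Chooser chooser = new ThrottlingChooser(new FifoChooser(), () -> now[0], limits);
        chooser.update(new Msg("p0", "a0"));
        chooser.update(new Msg("p0", "a1"));
        chooser.update(new Msg("p1", "b0"));
        chooser.update(new Msg("p1", "b1"));
        for (long t = 0; t < 4; t++) {
            now[0] = t;
            Msg m = chooser.choose();
            System.out.println("t=" + t + " -> " + (m == null ? "null" : m.body));
        }
    }
}
```

          The wrapper leaves the ordering policy to the delegate, which also makes the concern about mixed semantics concrete: the delegate only ever orders the messages that the throttle has let through.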

          Sriram Subramanian added a comment -

          Here is my thought on this. We have a mechanism to schedule messages using the MessageChooser. It makes sense to implement throttling as one form of scheduling. This can be done either by using the Throttler instance that the ThrottleMessageChooser uses to throttle each stream partition or by using the metrics value. I see the benefit of using the metrics approach to throttle arbitrary parameters but it is safer to not expose that feature now without understanding how it would behave for different parameters. In the future, if we really need to support arbitrary throttling, we could make the metric type a config value that is used when the Picker is a ThrottleMessageChooser.


            People

            • Assignee:
              Sriram Subramanian
              Reporter:
              Chris Riccomini
            • Votes:
              0
              Watchers:
              2

              Dates

              • Created:
                Updated:

                Development