Description
Trigger rebalances and measure:
- overall throughput
- paused time
- also look at the metrics from KAFKA-8609 (https://issues.apache.org/jira/browse/KAFKA-8609):
  - accumulated rebalance time
Cluster/topic sizing:
-
- 10 instances
- 100 tasks (each instance gets 10 tasks)
- 1000 stores (each task gets 10 stores)
- standbys = [0 and 1]
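The sizing above can be sketched as a small helper, mainly to make the per-instance load explicit. This is only an illustration under assumptions: the class and method names are hypothetical, the application id is made up, and tasks and standbys are assumed to be spread evenly across instances. The only Kafka-specific names used are the string keys of the Streams configs `application.id` and `num.standby.replicas`.

```java
import java.util.Properties;

public class RebalanceBenchmarkSizing {
    // Values taken from the sizing list above.
    static final int INSTANCES = 10;
    static final int TASKS = 100;
    static final int STORES = 1000;

    // Active tasks hosted by each instance, assuming an even assignment.
    static int tasksPerInstance() {
        return TASKS / INSTANCES;
    }

    // State stores owned by each task.
    static int storesPerTask() {
        return STORES / TASKS;
    }

    // Standby tasks hosted by each instance, assuming an even assignment.
    static int standbyTasksPerInstance(int standbys) {
        return TASKS * standbys / INSTANCES;
    }

    // Streams properties for one run; "rebalance-benchmark" is a hypothetical application id.
    static Properties streamsProps(int standbys) {
        Properties props = new Properties();
        props.setProperty("application.id", "rebalance-benchmark");
        props.setProperty("num.standby.replicas", Integer.toString(standbys));
        return props;
    }

    public static void main(String[] args) {
        // Print the per-instance load for both standby settings in the plan.
        for (int standbys : new int[] {0, 1}) {
            System.out.println("standbys=" + standbys
                + " activePerInstance=" + tasksPerInstance()
                + " standbyPerInstance=" + standbyTasksPerInstance(standbys)
                + " storesPerTask=" + storesPerTask()
                + " props=" + streamsProps(standbys));
        }
    }
}
```

With standbys = 1, each instance would also restore and maintain state for roughly as many standby tasks as active ones, which is worth keeping in mind when comparing throughput between the two settings.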
Rolling bounce:
- with and without state loss
- bounce durations longer and shorter than the session timeout (shorter in particular should be interesting)
Expand (from 9 to 10 instances)
Contract (from 10 to 9 instances)
With and without saturation
EOS:
- with and without
Topology:
- stateful
- windowed aggregation
Key Parameterizations:
1. control: no rebalances
2. rolling bounce without state loss, faster than the session timeout
3. expand from 9 to 10 instances
4. contract from 10 to 9 instances
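One possible way to encode the four key parameterizations for a benchmark driver is sketched below. The enum, its fields, and the helper are hypothetical names chosen for illustration; the instance counts come from the list above, and the task count of 100 from the sizing section. The helper assumes tasks are spread as evenly as possible, which shows that with 9 instances the task count does not divide evenly.

```java
public class KeyParameterizations {
    // Hypothetical encoding of the four key scenarios: instance count before and after the event.
    enum Scenario {
        CONTROL(10, 10),                      // no rebalances
        ROLLING_BOUNCE_NO_STATE_LOSS(10, 10), // each bounce faster than the session timeout
        EXPAND(9, 10),
        CONTRACT(10, 9);

        final int instancesBefore;
        final int instancesAfter;

        Scenario(int instancesBefore, int instancesAfter) {
            this.instancesBefore = instancesBefore;
            this.instancesAfter = instancesAfter;
        }
    }

    static final int TASKS = 100;

    // Largest number of active tasks on any one instance, assuming an even spread (ceiling division).
    static int maxTasksPerInstance(int instances) {
        return (TASKS + instances - 1) / instances;
    }

    public static void main(String[] args) {
        for (Scenario s : Scenario.values()) {
            System.out.printf("%s: %d -> %d instances, at most %d -> %d tasks per instance%n",
                s, s.instancesBefore, s.instancesAfter,
                maxTasksPerInstance(s.instancesBefore),
                maxTasksPerInstance(s.instancesAfter));
        }
    }
}
```

Under these assumptions, the expand and contract scenarios move between at most 12 and at most 10 tasks per instance, so some load imbalance is expected in the 9-instance state regardless of rebalance behavior.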