Details
Improvement
Status: Resolved
Major
Resolution: Fixed
1.3
None
Description
I wish to have load balanced job queues, like in ActiveMQ (copied and pasted):
"A queue implements load balancer semantics. A single message will be received by exactly one consumer. If there are no consumers available at the time the message is sent it will be kept until a consumer is available that can process the message. If a consumer receives a message and does not acknowledge it before closing then the message will be redelivered to another consumer. A queue can have many consumers with messages load balanced across the available consumers."
For example, suppose that I send three jobs (j1, j2, j3) to a queue with 2 consumers (c1, c2).
The first job takes 10 seconds to complete; jobs 2 and 3 take only 1 second each.
Consumers are using 'client' ack with credit:1,0 in order to receive only one job at a time.
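For reference, a minimal sketch of the subscription described above, written as a raw STOMP SUBSCRIBE frame. The `ack:client` and `credit:1,0` headers come from this report; the destination `/queue/jobs` and the subscription id `c1` are made-up values for illustration only.

```python
def subscribe_frame(destination: str, sub_id: str, credit: str = "1,0") -> bytes:
    """Build a STOMP SUBSCRIBE frame using client acks and a credit header."""
    headers = [
        ("destination", destination),  # hypothetical queue name
        ("id", sub_id),                # hypothetical subscription id
        ("ack", "client"),             # consumer acknowledges explicitly
        ("credit", credit),            # value taken from this report
    ]
    lines = ["SUBSCRIBE"] + [f"{name}:{value}" for name, value in headers]
    # A blank line ends the headers; a NUL byte terminates the frame.
    return ("\n".join(lines) + "\n\n").encode("utf-8") + b"\x00"


frame = subscribe_frame("/queue/jobs", "c1")
```

The resulting bytes would be written to the broker socket after the usual CONNECT handshake.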
The desired behaviour of consumers is:
[ 21:30:00 ][ c1 ] Got job 1
[ 21:30:00 ][ c2 ] Got job 2
[ 21:30:01 ][ c2 ] Ack job 2, now idle
[ 21:30:01 ][ c2 ] Got job 3
[ 21:30:02 ][ c2 ] Ack job 3, now idle
[ 21:30:10 ][ c1 ] Ack job 1, now idle
But currently, Apollo does:
[ 21:30:00 ][ c1 ] Got job 1
[ 21:30:00 ][ c2 ] Got job 2
[ 21:30:01 ][ c2 ] Ack job 2, now idle (c2 is idle but does not get job 3)
[ 21:30:10 ][ c1 ] Ack job 1, now idle
[ 21:30:10 ][ c1 ] Got job 3
[ 21:30:11 ][ c1 ] Ack job 3, now idle
It seems that jobs are assigned round-robin at the moment the broker receives them, not when a consumer becomes idle.
If in this example I send 9 one-second jobs (instead of 2), consumer #1 gets 5 jobs (the slow one plus 4 fast ones) and consumer #2 gets the remaining 5, when the optimum would be to send all 9 fast jobs to consumer #2 while consumer #1 is processing the slow one. I know that using 'client' ack with credit:1,0 is suboptimal from the broker's perspective, but it is the optimal way to balance jobs between workers.
Besides the underutilization of resources, the main problem is that if a consumer takes too much time to process a job (say, due to a DB lock) it may block the processing of a bunch of jobs already assigned to it.
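The difference between the two behaviours can be illustrated with a small simulation. This is only a simplified model of what I observe from the outside, not Apollo's actual dispatch code: all jobs are assumed to be queued at time 0, function names are my own, and consumers are indexed from 0 (so c1 is index 0 and c2 is index 1).

```python
import heapq


def round_robin_on_arrival(durations, n_consumers):
    """Assign each job to a consumer in rotation, as soon as it is queued."""
    free_at = [0] * n_consumers
    schedule = []  # (job number, consumer index, start, end)
    for i, duration in enumerate(durations):
        consumer = i % n_consumers
        start = free_at[consumer]
        free_at[consumer] = start + duration
        schedule.append((i + 1, consumer, start, free_at[consumer]))
    return schedule


def dispatch_to_idle(durations, n_consumers):
    """Hand the next queued job to whichever consumer becomes idle first."""
    idle = [(0, consumer) for consumer in range(n_consumers)]  # (free at, index)
    heapq.heapify(idle)
    schedule = []
    for i, duration in enumerate(durations):
        start, consumer = heapq.heappop(idle)
        heapq.heappush(idle, (start + duration, consumer))
        schedule.append((i + 1, consumer, start, start + duration))
    return schedule


def makespan(schedule):
    """Time at which the last job finishes."""
    return max(end for _, _, _, end in schedule)


three_jobs = [10, 1, 1]        # j1 is slow, j2 and j3 are fast
ten_jobs = [10] + [1] * 9      # one slow job followed by nine fast ones
```

With `three_jobs` and 2 consumers, the round-robin model finishes at second 11 (job 3 waits behind job 1), while dispatching to the idle consumer finishes at second 10. With `ten_jobs`, the round-robin model needs 14 seconds against 10 for idle dispatch.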
...
BTW, impressive piece of work!!! I am really impressed by the completeness of features, and by how well Apollo behaved when I overloaded it with stress tests.