In version 1.1, https://github.com/apache/kafka/blob/71b1e19fc60b5e1f9bba33025737ec2b7fb1c2aa/core/src/main/scala/kafka/security/auth/SimpleAclAuthorizer.scala#L110
the ACL check builds a merged ACL Set with the ++ operation
and then passes it to aclMatch as a parameter.
We found that Scala's Set ++ operation is very slow, for example when the Set on the right-hand side of ++ has more than 100 entries.
The slowness comes from iterating over every entry of the right-hand Set, where the hashCode calculation appears to be expensive.
The performance of 'authorize' is important because every request delivered to the broker goes through this logic. That is why we cannot leave it as-is, even though the change in this proposal seems trivial.
Here is the approach. We propose to solve this issue by introducing a new class, 'AclSets', which takes multiple Sets as parameters and runs 'find' against them one by one.
This approach avoids the Set ++ operation
and thus substantially outperforms the old logic.
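
The idea can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the 'Acl' case class is a hypothetical stand-in for Kafka's own type, and the method set of 'AclSets' (here 'find' and 'isEmpty') is an assumption about what the caller needs.

```scala
// Hypothetical stand-in for Kafka's Acl type; the fields are illustrative only.
final case class Acl(principal: String, host: String, operation: String, allow: Boolean)

// Wraps several Sets and searches them in order, without ever merging them.
// The shape of this class is an assumption made for this sketch.
final class AclSets(sets: Set[Acl]*) {
  // Returns the first entry matching p, scanning each underlying Set in turn.
  def find(p: Acl => Boolean): Option[Acl] =
    sets.iterator.flatMap(_.iterator).find(p)

  // True only when every underlying Set is empty.
  def isEmpty: Boolean = sets.forall(_.isEmpty)
}

object AclSetsExample {
  def main(args: Array[String]): Unit = {
    val resourceAcls = Set(Acl("User:alice", "*", "Read", allow = true))
    val wildcardAcls = Set(Acl("User:bob", "*", "Write", allow = true))

    // Old style: builds a brand-new merged Set on every call.
    val merged = resourceAcls ++ wildcardAcls
    // Proposed style: no merged Set is built.
    val acls = new AclSets(resourceAcls, wildcardAcls)

    val isBob: Acl => Boolean = _.principal == "User:bob"
    assert(merged.find(isBob) == acls.find(isBob))
    assert(acls.find(_.principal == "User:carol").isEmpty)
  }
}
```

Because 'find' only reads the existing Sets, no entry has to be re-inserted into a new Set, which is where the ++ cost comes from.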
The benchmark results (tested with Kafka version 1.1) show a notable difference, about 3x higher throughput, under the following conditions:
1. the Set on the left consists of 60 entries
2. the Set on the right consists of 30 entries
3. the search is for an absent entry (so that all entries are iterated)
The benchmark results are as follows.
Benchmark                       Mode  Cnt    Score     Error   Units
ScalaSetConcatination.Set      thrpt    3  281.974 ± 140.029  ops/ms
ScalaSetConcatination.AclSets  thrpt    3  887.426 ±  40.261  ops/ms
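
For anyone who wants to try the comparison locally, the conditions above can be approximated with a crude wall-clock harness such as the one below. This is only a sketch and not the benchmark that produced the numbers above: the 'Entry' type, the object name, and the iteration count are all hypothetical, and a proper measurement should use a benchmarking tool such as JMH.

```scala
object ConcatVsSequentialFind {
  // Hypothetical element type; the real benchmark's entry type is not shown in this report.
  final case class Entry(principal: String, host: String, operation: String)

  // 60 entries on the left, 30 on the right, as in the conditions listed above.
  val left: Set[Entry]  = (1 to 60).map(i => Entry(s"User:l$i", "*", "Read")).toSet
  val right: Set[Entry] = (1 to 30).map(i => Entry(s"User:r$i", "*", "Read")).toSet

  // Matches nothing, so every entry of both Sets is visited.
  val absent: Entry => Boolean = _.principal == "User:absent"

  // Old logic: merge with ++, then search the merged Set.
  def viaConcat(): Option[Entry] = (left ++ right).find(absent)

  // Proposed logic: search each Set in turn, never merging.
  def viaSequentialFind(): Option[Entry] = left.find(absent).orElse(right.find(absent))

  // Crude wall-clock timer returning elapsed milliseconds for the given number of runs.
  def timeMs(iterations: Int)(body: => Any): Long = {
    val start = System.nanoTime()
    var i = 0
    while (i < iterations) { body; i += 1 }
    (System.nanoTime() - start) / 1000000L
  }

  def main(args: Array[String]): Unit = {
    val n = 100000 // arbitrary iteration count chosen for this sketch
    // Warm-up runs so the JIT has a chance to compile both paths.
    timeMs(n)(viaConcat())
    timeMs(n)(viaSequentialFind())
    println(s"concat:          ${timeMs(n)(viaConcat())} ms")
    println(s"sequential find: ${timeMs(n)(viaSequentialFind())} ms")
  }
}
```

The absolute timings depend on the Scala version, the JVM, and the element type's hashCode, so they should be read only as a relative indication.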
As upstream also uses a similar ++ operation, https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/security/authorizer/AclAuthorizer.scala#L360
we think it is necessary to fix this issue.