There seems to be a performance problem with the function retainAll of
CompositeCollection. This is analogous to https://issues.apache.org/jira/browse/COLLECTIONS-534 .
The following is the code of the function:
The performance problem occurs when the underlying collections in the current collection have a slow retainAll method. Whenever we're relying on Collection::retainAll, slow cases tend to occur when the parameter collection has a slow contains method.
The following test harness shows the performance degradation between
using a list and using a set as a parameter, across different collection sizes.
In the following table "Original" corresponds to the time taken by
the existing implementation of CompositeCollection::retainAll, "Repaired" to the time taken by the function invoked with the parameter converted into a set, and "Speed-up" to the speed-up obtained after the repair.
N Original(s) Repaired(s) Speed-up(X)
10 1.04 1.04 1.00
100 1.04 1.05 0.99
1000 1.06 1.06 1.00
10000 1.12 1.10 1.01
100000 9.34 1.15 8.12
500000 > 300 1.29 > 232.55
If it's unacceptable to convert the parameter into a set before calling
retainsAll, a solution would be to include a warning to the user in the documentation that the parameter should have a fast contains method when possible.