Details
-
Improvement
-
Status: Closed
-
Major
-
Resolution: Fixed
-
3.2.1
-
None
-
None
Description
I am encountering a performance problem in
CursorableLinkedList.containsAll() (which also affects
NodeCachingLinkedList). It appears in version 3.2.1 and also in
revision 1381642. I attached a test that exposes this problem and a
patch that fixes it. On my machine, for this test, the patch provides
a 182X speedup.
To run the test, just do:
$ java Test
The output for the un-patched version is:
Time is 5110
The output for the patched version is:
Time is 28
The problem is that "containsAll(Collection<?> coll)" iterates over
the elements of "coll" and calls "contains" on "this", which is very
slow, because "this" is a list.
This problem is different from COLLECTIONS-416. As established in the
COLLECTIONS-416 discussion, there the user was responsible for passing
the proper data structure as the parameter "coll" to the method.
However, for "AbstractLinkedList.containsAll()", the problem is not
related to the collection passed as the parameter. The problem will
always manifest for this method, irrespective of the parameter type.
I attached two patches. Both implement a linear time algorithm using
a HashSet. patchSmall.diff populates the HashSet eagerly, and
patchFull.diff populates the HashSet lazily. patchFull.diff is faster
than patchSmall.diff when containsAll() returns false after inspecting
only a few elements, though in most cases they are equally fast. I am
including patchSmall.diff just in case you prefer smaller code.
There is a potential issue in using a HashSet for comparisons: HashSet
uses "equals()" not "AbstractLinkedList.isEqualValue(Object value1,
Object value2)". Currently, this is not a problem, because
"AbstractLinkedList.isEqualValue" is implemented like this:
protected boolean isEqualValue(Object value1, Object value2) { return (value1 == value2 || (value1 == null ? false : value1.equals(value2))); }
which is the code used by HashSet to check equality, and no subclass
of AbstractLinkedList (there are only two subclasses:
"CursorableLinkedList" and "NodeCachingLinkedList") overrides
"isEqualValue".
If "isEqualValue" is indeed likely to be overridden in the future, we
can either (1) wrap the objects in the HashSet and define an
"equals()" that calls "isEqualValue" or (2) detect if the method is
really overridden through reflection with something like this:
boolean isOverriden = false; Class<?> curClass = this.getClass(); while (!curClass.equals(AbstractLinkedList.class)) { try { curClass.getDeclaredMethod("isEqualValue", Object.class, Object.class); isOverriden = true; break; } catch (Exception e) { curClass = curClass.getSuperclass(); } }
Attachments
Attachments
Issue Links
- is related to
-
COLLECTIONS-429 Performance problem in MultiValueMap
- Closed