[COLLECTIONS-429] Performance problem in MultiValueMap - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 3.2.1
Fix Version/s: 4.0-alpha1, 4.0
Component/s: None
Labels:
None
Environment:

java 1.6.0_24
Ubuntu 11.10

Description

Hi,

I am encountering a performance problem in MultiValueMap. It appears
in version 3.2.1 and also in revision 1366088. I attached a test that
exposes this problem and a patch that fixes it. On my machine, for
this test, the patch provides a 1158X speedup.

To run the test, just do:

$ java Test

The output for the un-patched version is:
Time is 44040

The output for the patched version is:
Time is 38

The attached test shows that, for a "MultiValueMap multi" object, the
following operation is very slow:

multi.values().containsAll(toContain);

"MultiValueMap.values()" returns a "MultiValueMap.Values" object,
which inherits containsAll() from "java.util.AbstractCollection",
which has quadratic complexity.

I attached two patches. Both patches override containsAll() and
implement a linear time algorithm. patchSmall.diff populates a
HashSet eagerly, and patchFull.diff populates a HashSet lazily.
patchFull.diff is faster than patchSmall.diff when containsAll()
returns false after inspecting only a few elements, though in most
cases they are equally fast. I am including patchSmall.diff just in
case you prefer smaller code.

Note that this problem is different from ~~COLLECTIONS-416~~. As
established in the ~~COLLECTIONS-416~~ discussion, there the user was
responsible for using the proper data structures as argument for the
method.

For "MultiValueMap.values()", the problem is not related to the
collection passed as parameter. The problem will always manifest for
this method, irrespective of the parameter type. I attached a test
(TestDifferentParameter.java) that shows that, even with a HashSet
parameter, the current problem still manifests (which did not happen
for ~~COLLECTIONS-416~~).

This problem also exists for the two other "Values" classes
(AbstractHashedMap.Values, StaticBucketMap.Values). I attached tests
and patches for these classes as well. If you want me to file
separate reports, just let me know.

Is this truly a performance problem, or am I misunderstanding the
intended behavior? If so, can you please confirm that the patches are
correct?

Thanks,

Adrian

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

patchFull_AbstractHashedMap.diff
26/Jul/12 19:04
1 kB
Adrian Nistor
patchFull_StaticBucketMap.diff
26/Jul/12 19:04
1 kB
Adrian Nistor
patchFull.diff
26/Jul/12 19:04
1 kB
Adrian Nistor
patchSmall_AbstractHashedMap.diff
26/Jul/12 19:04
0.9 kB
Adrian Nistor
patchSmall_StaticBucketMap.diff
26/Jul/12 19:04
0.8 kB
Adrian Nistor
patchSmall.diff
26/Jul/12 19:04
0.8 kB
Adrian Nistor
Test_AbstractHashedMap.java
26/Jul/12 19:04
0.8 kB
Adrian Nistor
Test_StaticBucketMap.java
26/Jul/12 19:04
0.8 kB
Adrian Nistor
Test.java
26/Jul/12 19:04
0.8 kB
Adrian Nistor
TestDifferentParameter.java
26/Jul/12 19:04
0.8 kB
Adrian Nistor

Issue Links

relates to

COLLECTIONS-434 performance problem in CursorableLinkedList.containsAll()

Closed

Activity

People

Assignee:: Unassigned

Reporter:: Adrian Nistor

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 26/Jul/12 19:03

Updated:: 09/Nov/14 14:31

Resolved:: 17/Apr/13 19:36