[OLINGO-1625] The serializers have performance issues when Entities contain very large numbers of Properties - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: (Java) V4 5.0.0
Fix Version/s: (Java) V4 5.0.1
Component/s: odata4-server
Labels:
- json
- performance
- serialization
- xml

Flags: Patch

Description

I've implemented an OData service that streams some large datasets. Some of those datasets have a large number of fields (over 1,000). When I requested one of them, around 350M in size, the request took far longer than expected.

I profiled the request in IntelliJ's profiler and found that over 75% of the CPU cycles were spent in String.equals(), comparing column names in the serializers. The cause is O(N^2) behavior: for every selected column (in my case, all of them), the serializer iterates across the entire list of entity properties looking for the one with the same name.
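To make the cost concrete, here is a minimal, self-contained sketch of the lookup pattern described above. It is not Olingo's actual serializer code: `Prop`, `findByName`, and `countComparisons` are hypothetical stand-ins that only mimic a first-match scan over a property list and count the `String.equals()` calls it makes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the quadratic lookup pattern; not Olingo code.
class LinearScanSketch {
    // Minimal stand-in for an entity property (name plus value).
    static final class Prop {
        final String name;
        final Object value;

        Prop(String name, Object value) {
            this.name = name;
            this.value = value;
        }
    }

    // Running count of String.equals() calls made by findByName.
    static long comparisons = 0;

    // Linear scan: walks the property list until a name matches, or returns null.
    static Prop findByName(List<Prop> props, String name) {
        for (Prop p : props) {
            comparisons++;
            if (p.name.equals(name)) {
                return p;
            }
        }
        return null;
    }

    // Resolves every selected name with a fresh scan and returns the total
    // number of name comparisons performed for one entity.
    static long countComparisons(List<Prop> props, List<String> selected) {
        comparisons = 0;
        for (String name : selected) {
            findByName(props, name);
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 1000; // roughly the field count mentioned in this report
        List<Prop> props = new ArrayList<>();
        List<String> selected = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            props.add(new Prop("col" + i, i));
            selected.add("col" + i);
        }
        // With every column selected in list order: n * (n + 1) / 2 comparisons.
        System.out.println(countComparisons(props, selected)); // prints 500500
    }
}
```

In this sketch, 1,000 properties cost about half a million string comparisons per entity, and that cost is paid again for every entity in the response.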

I have already implemented a fix: before serializing the properties, the serializer builds a hash map from property name to property, which makes the resulting algorithm O(N) in the number of properties being serialized.
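The idea behind the fix can be sketched as follows. This is an illustration under assumptions, not the contents of the attached patch: `Prop`, `indexByName`, and `resolve` are hypothetical names, and keeping the first property when names repeat is a choice made here to match what a first-match scan would return.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a name-to-property index; not the attached patch.
class PropertyIndexSketch {
    // Minimal stand-in for an entity property (name plus value).
    static final class Prop {
        final String name;
        final Object value;

        Prop(String name, Object value) {
            this.name = name;
            this.value = value;
        }
    }

    // Builds the index in a single pass over the properties. putIfAbsent keeps
    // the first property seen for a name, as a first-match scan would.
    static Map<String, Prop> indexByName(List<Prop> props) {
        Map<String, Prop> index = new HashMap<>(Math.max(16, props.size() * 2));
        for (Prop p : props) {
            index.putIfAbsent(p.name, p);
        }
        return index;
    }

    // Resolves the selected names in selection order using one hash lookup per
    // name. Names with no matching property are skipped.
    static List<Prop> resolve(List<Prop> props, List<String> selected) {
        Map<String, Prop> index = indexByName(props);
        List<Prop> result = new ArrayList<>(selected.size());
        for (String name : selected) {
            Prop p = index.get(name);
            if (p != null) {
                result.add(p);
            }
        }
        return result;
    }
}
```

Building the index is one pass over the properties and each lookup is expected constant time, so the work per entity grows linearly with the number of properties rather than quadratically.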

After making the change, I profiled the request again in IntelliJ's profiler: String.equals(), which accounted for over 75% of CPU cycles before, now accounts for under 1%.

I will be creating a patch and attaching it momentarily.

Attachments

0001-OLINGO-1625-Fix-performance-problem-in-serializers-f.patch
04/Jun/24 19:25
14 kB
Ron Passerini

People

Assignee: Unassigned

Reporter: Ron Passerini

Votes: 1

Watchers: 2

Dates

Created: 04/Jun/24 19:06

Updated: 02/Jul/24 13:41