[KAFKA-3534] Deserialize on demand when default time extractor used - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 0.10.2.0
Fix Version/s: None
Component/s: streams
Labels:
- performance

Description

When records are added to the RecordQueue, they are deserialized immediately so that the timestamp can be extracted. For data flows that consume large messages (particularly compressed ones), this can cause large memory spikes, and even out-of-memory errors, because every buffered message must be deserialized before any of them is processed. A possible optimization is to require deserialization at this stage only when a non-default timestamp extractor is in use.
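
To illustrate the proposed optimization, here is a minimal, self-contained sketch of a queue that buffers raw bytes and deserializes only when a record is polled for processing. It is not the Kafka Streams RecordQueue implementation: `RawRecord`, `LazyRecordQueue`, and `LazyQueueDemo` are hypothetical names, and the sketch assumes the timestamp is already available from the record metadata (the default-extractor case), so the payload does not have to be decoded on enqueue.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Function;

// Hypothetical stand-in for a raw consumer record: undecoded payload bytes
// plus the timestamp carried in the record metadata.
final class RawRecord {
    final byte[] value;
    final long timestamp;

    RawRecord(byte[] value, long timestamp) {
        this.value = value;
        this.timestamp = timestamp;
    }
}

// Hypothetical queue that keeps records as raw bytes and decodes on demand.
final class LazyRecordQueue<V> {
    private final Queue<RawRecord> buffer = new ArrayDeque<>();
    private final Function<byte[], V> deserializer;
    private int deserializeCalls = 0;

    LazyRecordQueue(Function<byte[], V> deserializer) {
        this.deserializer = deserializer;
    }

    // Enqueue: the metadata timestamp is returned without touching the payload.
    long add(RawRecord record) {
        buffer.add(record);
        return record.timestamp;
    }

    // Dequeue: decode exactly one record, right before it is processed.
    V poll() {
        RawRecord raw = buffer.poll();
        if (raw == null) {
            return null;
        }
        deserializeCalls++;
        return deserializer.apply(raw.value);
    }

    int deserializeCalls() {
        return deserializeCalls;
    }

    int size() {
        return buffer.size();
    }
}

final class LazyQueueDemo {
    public static void main(String[] args) {
        LazyRecordQueue<String> queue =
                new LazyRecordQueue<>(bytes -> new String(bytes, StandardCharsets.UTF_8));
        queue.add(new RawRecord("first".getBytes(StandardCharsets.UTF_8), 100L));
        queue.add(new RawRecord("second".getBytes(StandardCharsets.UTF_8), 200L));

        // Nothing has been decoded yet; only raw bytes are buffered.
        System.out.println("decoded before poll: " + queue.deserializeCalls());
        System.out.println("polled: " + queue.poll());
        System.out.println("decoded after poll: " + queue.deserializeCalls());
    }
}
```

With this shape, only one decoded object exists per poll, while the rest of the buffer stays in its serialized form. A custom timestamp extractor that inspects the payload would still need the decoded value at enqueue time, which is why the description limits the optimization to the default extractor.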

Issue Links

Is contained by

KAFKA-3514 Stream timestamp computation needs some further thoughts

Resolved

relates to

KAFKA-4785 Records from internal repartitioning topics should always use RecordMetadataTimestampExtractor

Resolved

People

Assignee: Unassigned

Reporter: Michael Coon

Votes: 0

Watchers: 2

Dates

Created: 08/Apr/16 19:30

Updated: 18/Oct/19 17:59

Resolved: 18/Oct/19 17:59