Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 3.6.0
- Fix Version/s: None
Description
I am facing an issue on the remote data path for uncommitted reads.
As mentioned in the original PR, if a transaction spans a long sequence of segments, the time taken to retrieve the producer snapshots from remote storage can, in the worst case, become prohibitive and block reads if it consistently exceeds the fetch request deadline (fetch.max.wait.ms).
Essentially, the method used to compute the uncommitted records to return in a fetch response has asymptotic complexity proportional to the number of segments in the log. This is not a problem with local storage, since the constant factor for traversing the producer snapshot files is small enough, but that is not the case with remote storage, which exhibits much higher read latency.
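To make the cost concrete, here is a minimal sketch of the per-segment walk described above. All names and types are hypothetical stand-ins, not Kafka's actual API; the point is only that each iteration pays one remote round trip, so total latency scales linearly with the number of segments a transaction spans:

{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not actual Kafka code: illustrates why collecting
// aborted transactions for a fetch is O(#segments) on the remote path.
public class AbortedTxnScanSketch {

    record AbortedTxn(long producerId, long firstOffset, long lastOffset) {}
    record SegmentMetadata(long baseOffset, long endOffset) {}

    interface RemoteStorage {
        // One remote round trip per call: the dominant cost on this path.
        List<AbortedTxn> readAbortedTxns(SegmentMetadata segment);
    }

    interface RemoteLogMetadata {
        SegmentMetadata segmentContaining(long offset);   // null if none
        SegmentMetadata nextSegment(SegmentMetadata seg); // null at log end
    }

    static List<AbortedTxn> collectAbortedTxns(RemoteLogMetadata metadata,
                                               RemoteStorage storage,
                                               long fetchOffset,
                                               long upperBoundOffset) {
        List<AbortedTxn> result = new ArrayList<>();
        SegmentMetadata segment = metadata.segmentContaining(fetchOffset);
        // Walk segment by segment up to the fetch upper bound. With remote
        // storage, each iteration pays a full remote read latency, so a
        // transaction spanning many segments can push the total past the
        // fetch.max.wait.ms deadline.
        while (segment != null && segment.baseOffset() < upperBoundOffset) {
            result.addAll(storage.readAbortedTxns(segment));
            segment = metadata.nextSegment(segment);
        }
        return result;
    }
}
{code}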
An aggravating factor was lock contention in the remote index cache, which has since been mitigated by KAFKA-15084. Unfortunately, despite the improvements observed once that contention was removed, the algorithmic complexity of the current method used to compute uncommitted records can always defeat any optimisation made on the remote read path.
Maybe we could start thinking (if not already) about a different construct which would reduce that complexity to O(1) - i.e. make the computation independent of the number of segments and of the spans of transactions.
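Purely as an illustration of the direction (not a proposal from this ticket): one possible shape for such a construct is a consolidated, log-level index of aborted transactions kept sorted by offset, so a fetch answers with a single bounded range lookup whose cost depends on the number of matching transactions rather than on the number of segments. Everything below is hypothetical:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch of a log-level aborted-transaction index. Entries are
// keyed by the transaction's last offset so a range query can skip every
// transaction that ended before the fetch position.
public class LogLevelAbortedTxnIndex {

    public record AbortedTxn(long producerId, long firstOffset, long lastOffset) {}

    // lastOffset -> transactions ending at that offset. Assumed to be
    // maintained as transactions abort (e.g. when segments are uploaded).
    private final NavigableMap<Long, List<AbortedTxn>> byLastOffset = new TreeMap<>();

    public void add(AbortedTxn txn) {
        byLastOffset.computeIfAbsent(txn.lastOffset(), k -> new ArrayList<>()).add(txn);
    }

    // Returns aborted transactions overlapping [fetchOffset, upperBoundOffset):
    // any txn with lastOffset >= fetchOffset and firstOffset < upperBoundOffset.
    // Cost is O(log n + matches), independent of the segment count.
    public List<AbortedTxn> lookup(long fetchOffset, long upperBoundOffset) {
        List<AbortedTxn> result = new ArrayList<>();
        for (List<AbortedTxn> bucket : byLastOffset.tailMap(fetchOffset, true).values()) {
            for (AbortedTxn txn : bucket) {
                if (txn.firstOffset() < upperBoundOffset) {
                    result.add(txn);
                }
            }
        }
        return result;
    }
}
{code}

The trade-off is moving cost to the write path: such an index would have to be built and persisted as segments are uploaded (and replicated alongside the remote log metadata), in exchange for a read path that no longer walks per-segment files.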
Issue Links
- is a child of
  - KAFKA-16947 Kafka Tiered Storage V2 (Open)
- is related to
  - KAFKA-16780 Txn consumer exerts pressure on remote storage when collecting aborted transactions (In Progress)