Description
While running freon with a single-node Ratis setup, it was observed that the TimeoutScheduler holds on to the raftClientObject for at least 3 s (the default requestTimeoutDuration), even though the request has already been processed successfully and acknowledged. This builds up memory pressure and causes the Ozone client to go OOM.
Heap dump analysis from HDDS-2331 suggests the timeout scheduler is holding onto a total of 176 requests (88 writeChunk requests containing actual data and 88 putBlock requests), even though data is written sequentially, key by key, in Ozone.
Thanks adoroszlai for helping to discover this.
cc ljain msingh szetszwo jnpandey
A similar fix may be required in GrpcLogAppender as well, since it uses the same TimeoutScheduler; see the sketch below.
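Below is a minimal sketch (not Ratis code) of the retention pattern described above, under the assumption that the timeout is scheduled on a ScheduledThreadPoolExecutor: the scheduled timeout task captures the request object, so the scheduler keeps it reachable for the full requestTimeoutDuration unless the task is cancelled and removed when the reply arrives. Class and method names here (TimeoutRetentionSketch, sendWithTimeout) are illustrative only.

{code:java}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Minimal sketch (not Ratis code): a scheduled timeout task captures the
 *  request, so the scheduler keeps it reachable until the task fires or is
 *  explicitly cancelled and removed. */
public class TimeoutRetentionSketch {
  private final ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);

  TimeoutRetentionSketch() {
    // Without this, cancelled tasks stay in the work queue (still holding the
    // captured request) until their original delay expires.
    scheduler.setRemoveOnCancelPolicy(true);
  }

  <T> CompletableFuture<T> sendWithTimeout(Object request, CompletableFuture<T> replyFuture,
      long timeout, TimeUnit unit) {
    // The lambda captures 'request'; the queued task holds it for up to 'timeout'.
    ScheduledFuture<?> timeoutTask = scheduler.schedule(
        () -> replyFuture.completeExceptionally(
            new TimeoutException("request timed out: " + request)),
        timeout, unit);

    // Cancel (and remove) the timeout task as soon as the reply arrives,
    // releasing the captured request instead of retaining it for the full delay.
    replyFuture.whenComplete((reply, error) -> timeoutTask.cancel(false));
    return replyFuture;
  }
}
{code}

If the timeout task is only cancelled but not removed from the executor's queue (i.e. setRemoveOnCancelPolicy is left at its default), the already-acknowledged requests remain reachable for the remainder of the 3 s delay, which matches the buffer retention seen in the heap dump.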
Attachments
Issue Links
- blocks
  - HDDS-2331 Client OOME due to buffer retention (Resolved)
- breaks
  - RATIS-732 TestRaftAsyncExceptionWithGrpc.testTimeoutException times out after 100s (Resolved)
  - RATIS-733 TestRaftOutputStreamWithGrpc.testSimpleWrite times out after 30s (Resolved)
  - RATIS-734 TestRaftServerWithGrpc.testRaftClientMetrics times out after 100s (Resolved)
  - RATIS-704 Invoke sendAsync as soon as OrderedAsync is created (Resolved)