Details

Type: Bug
Priority: Major
Status: Resolved
Resolution: Cannot Reproduce
Description
15-node physical cluster; all Datanodes are up and running.
A client using 16 threads to write 16,000 x 10 MB+ files with the FsStress utility
(https://github.com/arp7/FsPerfTest) fails with the errors below.
This is an intermittent issue.
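For reference, a minimal sketch that approximates this workload (it is not the FsStress code itself): multiple threads each writing 10 MB keys through the Ozone client API. The volume/bucket names, key naming scheme, and thread/key counts below are illustrative assumptions.

{code:java}
import org.apache.hadoop.hdds.conf.OzoneConfiguration;
import org.apache.hadoop.ozone.client.OzoneBucket;
import org.apache.hadoop.ozone.client.OzoneClient;
import org.apache.hadoop.ozone.client.OzoneClientFactory;
import org.apache.hadoop.ozone.client.io.OzoneOutputStream;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LargeKeyWriteRepro {
  private static final int THREADS = 16;                 // 16 client threads, as in the test
  private static final int KEYS_PER_THREAD = 1000;       // 16 x 1000 = 16,000 keys total
  private static final int KEY_SIZE = 10 * 1024 * 1024;  // 10 MB per key

  public static void main(String[] args) throws Exception {
    OzoneConfiguration conf = new OzoneConfiguration();
    try (OzoneClient client = OzoneClientFactory.getRpcClient(conf)) {
      // Assumes volume "vol1" and bucket "bucket1" already exist.
      OzoneBucket bucket = client.getObjectStore()
          .getVolume("vol1").getBucket("bucket1");
      byte[] payload = new byte[KEY_SIZE];

      ExecutorService pool = Executors.newFixedThreadPool(THREADS);
      for (int t = 0; t < THREADS; t++) {
        final int threadId = t;
        pool.submit(() -> {
          for (int i = 0; i < KEYS_PER_THREAD; i++) {
            String keyName = "key-" + threadId + "-" + i;
            try (OzoneOutputStream out = bucket.createKey(keyName, KEY_SIZE)) {
              // KeyOutputStream splits this write across blocks/containers.
              out.write(payload);
            } catch (Exception e) {
              // Intermittent failures surface here as IOExceptions wrapping the
              // Ratis AlreadyClosedException / NotLeaderException seen in the logs.
              System.err.println("FAILED " + keyName + ": " + e);
            }
          }
        });
      }
      pool.shutdown();
      pool.awaitTermination(2, TimeUnit.HOURS);
    }
  }
}
{code}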
Server-side exceptions:
19/04/22 10:13:32 ERROR io.KeyOutputStream: Try to allocate more blocks for write failed, already allocated 0 blocks for this write.
19/04/18 14:33:23 WARN io.KeyOutputStream: Encountered exception java.io.IOException: Unexpected Storage Container Exception: java.util.concurrent.CompletionException: java.util.concurrent.CompletionException: org.apache.ratis.protocol.AlreadyClosedException: SlidingWindow$Client client-ADE7F801D3AD->RAFT is closed.. The last committed block length is 0, uncommitted data length is 10485760 retry count 0
Client-side exceptions:
FAILED org.apache.ratis.protocol.NotLeaderException: Server c6e64cc4-91e9-4b36-83e4-6d84a4e71b7f is not the leader (f44c1413-0847-45e3-982d-ac3aec15dffc:10.17.200.23:9858). Request must be sent to leader., logIndex=0, commits[c6e64cc4-91e9-4b36-83e4-6d84a4e71b7f:c131161, 287eccfb-8461-419a-8732-529d042380b3:c131161, f44c1413-0847-45e3-982d-ac3aec15dffc:c131161]
With small keys (<1 MB), and with large keys written from a single thread, the client-side exceptions above are infrequent. However, with multiple threads writing 10 MB+ keys, the exceptions occur roughly 50% of the time and eventually cause write failures. I have attached the logs from one such failed pipeline.
Datanode Logs.zip