Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Extracted from HDDS-3907.
Secure acceptance tests intermittently fail at test cases where data is being written.
See https://github.com/elek/ozone-build-results/tree/master/2021/08/19/9810/acceptance-secure for logs.
Start freon testing | FAIL |
robot log.html
07:19:23.258 INFO Running command 'ozone freon randomkeys --num-of-volumes 5 --num-of-buckets 5 --num-of-keys 5 --num-of-threads 1 --replication-type RATIS --factor THREE --validate-writes 2>&1'.
07:24:23.225 FAIL Test timeout 5 minutes exceeded.
datanode_3 | 2021-08-19 05:20:09,598 [java.util.concurrent.ThreadPoolExecutor$Worker@5f5ccab7[State = -1, empty queue]] WARN server.GrpcLogAppender: 1c7f86b2-ded3-441b-9f20-84ba3ff60d2d@group-74FBCD15D899->25dd9de7-1caa-448d-a35a-2b29afced1cc-GrpcLogAppender: appendEntries Timeout, request=AppendEntriesRequest:cid=8,entriesCount=1,lastEntry=(t:3, i:0)
...
datanode_3 | 2021-08-19 05:23:56,577 [Thread-181] INFO client.GrpcClientProtocolService: Failed RaftClientRequest:client-14C4D4C86555->1c7f86b2-ded3-441b-9f20-84ba3ff60d2d@group-74FBCD15D899, cid=102, seq=0, Watch-ALL_COMMITTED(131), Message:<EMPTY>, reply=RaftClientReply:client-14C4D4C86555->1c7f86b2-ded3-441b-9f20-84ba3ff60d2d@group-74FBCD15D899, cid=102, FAILED org.apache.ratis.protocol.exceptions.NotReplicatedException: Request with call Id 102 and log index 131 is not yet replicated to ALL_COMMITTED, logIndex=131, commits[1c7f86b2-ded3-441b-9f20-84ba3ff60d2d:c132, 64230e6f-d613-4ced-8084-22c404c29d15:c132, 25dd9de7-1caa-448d-a35a-2b29afced1cc:c127]
datanode_2 | 2021-08-19 05:18:42,242 [Command processor thread] WARN commandhandler.CreatePipelineCommandHandler: Add group failed for 1c7f86b2-ded3-441b-9f20-84ba3ff60d2d{ip: 172.18.0.9, host: ozonesecure_datanode_3.ozonesecure_default, ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], networkLocation: /default-rack, certSerialId: null, persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}
datanode_2 | java.io.IOException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: Network closed for unknown reason
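To make the NotReplicatedException line from datanode_3 easier to read, here is a minimal Python sketch that extracts the per-peer values from the `commits[...]` list and reports which peers are behind the requested `logIndex`. It assumes that each `<peer-id>:c<N>` entry is that peer's commit index; this is an interpretation of the log format, not something stated in this issue. The helper name `lagging_peers` is hypothetical.

```python
import re

# Excerpt copied from the datanode_3 NotReplicatedException log line above.
LOG_EXCERPT = (
    "is not yet replicated to ALL_COMMITTED, logIndex=131, "
    "commits[1c7f86b2-ded3-441b-9f20-84ba3ff60d2d:c132, "
    "64230e6f-d613-4ced-8084-22c404c29d15:c132, "
    "25dd9de7-1caa-448d-a35a-2b29afced1cc:c127]"
)


def lagging_peers(text):
    """Return {peer_id: index} for peers whose reported index is below logIndex.

    Assumption: 'peer:cN' means peer has committed up to index N.
    """
    log_index = int(re.search(r"logIndex=(\d+)", text).group(1))
    commits_body = re.search(r"commits\[([^\]]*)\]", text).group(1)
    commits = {
        peer: int(idx)
        for peer, idx in re.findall(r"([0-9a-f-]+):c(\d+)", commits_body)
    }
    return {peer: idx for peer, idx in commits.items() if idx < log_index}


if __name__ == "__main__":
    for peer, idx in lagging_peers(LOG_EXCERPT).items():
        print(f"{peer} is behind: c{idx}")
```

Under that assumption, the only peer reported as behind is 25dd9de7-1caa-448d-a35a-2b29afced1cc, which is also the target of the `appendEntries Timeout` warning logged earlier by datanode_3.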
Issue Links
- is caused by: HDDS-4730 Use separate Ratis admin and client ports (Resolved)