Description
While working with our Namazu infrastructure, the first issue I hit after dialing up the faulty I/O injection rate was the following:
2019-09-27 14:13:45 ERROR RaftStorageDirectory:336 - Failed to acquire lock on /home/vagrant/test_data/data0_slowed/64656d6f-5261-6674-4772-6f7570313233/in_use.lock. If this storage directory is mounted via NFS, ensure that the appropriate nfs lock services are running.
java.io.IOException: Input/output error
    at java.io.RandomAccessFile.writeBytes(Native Method)
    at java.io.RandomAccessFile.write(RandomAccessFile.java:512)
    at org.apache.ratis.server.storage.RaftStorageDirectory.tryLock(RaftStorageDirectory.java:327)
    at org.apache.ratis.server.storage.RaftStorageDirectory.lock(RaftStorageDirectory.java:291)
    at org.apache.ratis.server.storage.RaftStorageDirectory.analyzeStorage(RaftStorageDirectory.java:264)
    at org.apache.ratis.server.storage.RaftStorage.analyzeAndRecoverStorage(RaftStorage.java:100)
    at org.apache.ratis.server.storage.RaftStorage.<init>(RaftStorage.java:63)
    at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:109)
    at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:110)
    at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:208)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Exception in thread "main" java.io.IOException: Input/output error
    at java.io.RandomAccessFile.writeBytes(Native Method)
    at java.io.RandomAccessFile.write(RandomAccessFile.java:512)
    at org.apache.ratis.server.storage.RaftStorageDirectory.tryLock(RaftStorageDirectory.java:327)
    at org.apache.ratis.server.storage.RaftStorageDirectory.lock(RaftStorageDirectory.java:291)
    at org.apache.ratis.server.storage.RaftStorageDirectory.analyzeStorage(RaftStorageDirectory.java:264)
    at org.apache.ratis.server.storage.RaftStorage.analyzeAndRecoverStorage(RaftStorage.java:100)
    at org.apache.ratis.server.storage.RaftStorage.<init>(RaftStorage.java:63)
    at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:109)
    at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:110)
    at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:208)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
However, it looks like nothing in this call chain retries the failed operation.
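To make the idea of a retry concrete, here is a minimal sketch of a bounded retry wrapper around an I/O action that can throw IOException. It is not Ratis code: the class RetryingIo, the interface IoAction, and the parameters maxAttempts and sleepMillis are all hypothetical names chosen for illustration. Whether retrying is the right response at this point in startup, and where in the call chain it would belong, is left open.

```java
import java.io.IOException;

/** Hypothetical helper, not part of Ratis: retries an I/O action a bounded number of times. */
public class RetryingIo {

    /** An I/O action that may fail with an IOException (hypothetical interface). */
    @FunctionalInterface
    public interface IoAction<T> {
        T run() throws IOException;
    }

    /**
     * Runs the action up to maxAttempts times, sleeping sleepMillis between
     * failed attempts. If every attempt fails, the last IOException is rethrown.
     */
    public static <T> T retry(IoAction<T> action, int maxAttempts, long sleepMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.run();
            } catch (IOException e) {
                last = e;
                // Only pause if another attempt is still coming.
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        throw last;
    }
}
```

A caller could, under these assumptions, wrap the failing step as `RetryingIo.retry(() -> acquireLock(), 5, 100L)`, where `acquireLock` stands in for whatever operation hit the I/O error. A fixed sleep keeps the sketch short; a backoff policy would be a separate design decision.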