Details
Improvement
Status: Open
Major
Resolution: Unresolved
3.7.0
None
None
Description
I got the following "Severe unrecoverable error" after a transient file write error, and the server exited.
2021-11-01 10:55:41,215 [myid:4] - ERROR [SyncThread:4:ZooKeeperCriticalThread@49] - Severe unrecoverable error, from thread : SyncThread:4
java.io.IOException: Write error
    at java.base/java.io.FileOutputStream.writeBytes(Native Method)
    at java.base/java.io.FileOutputStream.write(FileOutputStream.java:354)
    at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
    at java.base/java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142)
    at org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:377)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:599)
    at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:657)
    at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:235)
    at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:169)
I think this behavior is by design and comes from https://issues.apache.org/jira/browse/ZOOKEEPER-2247.
But is it a good design for the server to exit because a single commit failed?
I think it would be better to just let the commit fail, or to have the leader step down to follower, and keep the server running.
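
To make the suggestion concrete, here is a minimal sketch of the control flow I have in mind. It is not ZooKeeper code: TxnLog, StepDownHook, Outcome, and commitOrStepDown are hypothetical names made up for illustration, and the sketch assumes the caller has some safe way to give up leadership after a failed commit.

```java
import java.io.IOException;

// Hypothetical sketch only: none of these types exist in ZooKeeper.
public class CommitFailureSketch {

    /** Stand-in for a transaction log whose commit may throw on a write error. */
    interface TxnLog {
        void commit() throws IOException;
    }

    /** Hypothetical callback that lets the server give up leadership. */
    interface StepDownHook {
        void stepDown(IOException cause);
    }

    enum Outcome { COMMITTED, STEPPED_DOWN }

    /**
     * Tries to commit. On an IOException, the failure is handed to the
     * step-down hook and reported to the caller instead of ending the process.
     */
    static Outcome commitOrStepDown(TxnLog log, StepDownHook hook) {
        try {
            log.commit();
            return Outcome.COMMITTED;
        } catch (IOException e) {
            hook.stepDown(e);
            return Outcome.STEPPED_DOWN;
        }
    }

    public static void main(String[] args) {
        StepDownHook hook = cause ->
                System.out.println("stepping down after: " + cause.getMessage());

        // A commit that succeeds.
        System.out.println(commitOrStepDown(() -> { }, hook));

        // A commit that fails with the same kind of error as in the log above.
        System.out.println(commitOrStepDown(
                () -> { throw new IOException("Write error"); }, hook));
    }
}
```

This only shows the shape of the idea. It does not address whether the transaction log is still in a consistent state after a failed write, which is presumably the concern behind the current behavior.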