Recently we hit a strange scenario where the Procedure WAL roll fails because the file has already been created by someone else.
Going through the logs and code, we observed that an earlier Proc-WAL roll had failed while writing the header. On that failure the file stream is simply closed:
    try {
      ProcedureWALFormat.writeHeader(newStream, header);
      startPos = newStream.getPos();
    } catch (IOException ioe) {
      LOG.warn("Encountered exception writing header", ioe);
      newStream.close();
      return false;
    }
Since we neither delete the corrupted file nor increment flushLogId, every retry tries to create a file with the same flushLogId and fails again because that file already exists. An HMaster failover resolves the issue, but we should handle it in the roll itself.
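
One possible way to handle it is sketched below using plain java.nio rather than the HBase/HDFS APIs. This is only an illustration, not the actual fix: the class WalRollSketch, the HeaderWriter callback (standing in for ProcedureWALFormat.writeHeader), and the file name pattern are all hypothetical. The idea is to delete the partially written file when the header write fails, so that a retry with the same log id can create it again.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for the Procedure WAL roll; not HBase's actual code.
class WalRollSketch {

  /** Hypothetical callback playing the role of ProcedureWALFormat.writeHeader. */
  interface HeaderWriter {
    void write(OutputStream out) throws IOException;
  }

  /**
   * Tries to create the log file for logId and write its header.
   * If the header write fails, the partial file is deleted so that a
   * retry with the same logId is able to create the file again.
   */
  static boolean rollWriter(Path walDir, long logId, HeaderWriter headerWriter)
      throws IOException {
    // Illustrative file name only; the real naming scheme is an assumption here.
    Path newLog = walDir.resolve(String.format("pv2-%020d.log", logId));

    OutputStream newStream;
    try {
      // CREATE_NEW fails if the file exists, mimicking the "already created" error.
      newStream = Files.newOutputStream(
          newLog, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    } catch (FileAlreadyExistsException e) {
      return false;
    }

    try {
      headerWriter.write(newStream);
      newStream.flush();
    } catch (IOException ioe) {
      // Close the stream, then remove the corrupted file instead of leaving it behind.
      try {
        newStream.close();
      } catch (IOException ignored) {
        // Nothing more to do for the stream; still attempt the delete below.
      }
      Files.deleteIfExists(newLog);
      return false;
    }

    // The sketch closes the stream here; a real store would keep it open for writes.
    newStream.close();
    return true;
  }
}
```

With the delete in place, a failed header write followed by a retry on the same id succeeds. Without it, the retry would hit the existing file every time, which matches the behaviour described above. Incrementing the log id on failure would be an alternative approach.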