Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- 2.6.1
- None
- Reviewed
Description
In StandbyCheckpointer, if the legacy OIV directory was not properly created, or was deleted for some reason (e.g. by operator error), all checkpoint operations will fail. Not only will the ANN stop receiving new fsimages, but the JNs will also fill up with edit log files and cause the NN to crash.
// Save the legacy OIV image, if the output dir is defined.
String outputDir = checkpointConf.getLegacyOivImageDir();
if (outputDir != null && !outputDir.isEmpty()) {
  img.saveLegacyOIVImage(namesystem, outputDir, canceler);
}
It doesn't make sense to let such an unimportant step (saving the legacy OIV image) abort all checkpoints and cause the NN to crash (and possibly lose data).
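One possible way to handle this is to catch the failure of the OIV save step, log a warning, and let the rest of the checkpoint continue. The following is only a minimal, self-contained sketch of that idea and not the committed patch: the class CheckpointSketch, the interface LegacyOivSaver, and the step names are hypothetical stand-ins for the real StandbyCheckpointer code, and the warning goes to stderr instead of the NameNode's logger.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class CheckpointSketch {

    /** Hypothetical stand-in for img.saveLegacyOIVImage(...). */
    public interface LegacyOivSaver {
        void save(String outputDir) throws IOException;
    }

    /**
     * Runs a simplified checkpoint and returns the names of the steps
     * that completed. A failure in the optional OIV step is logged and
     * swallowed so that the later steps still run.
     */
    public static List<String> doCheckpoint(String outputDir, LegacyOivSaver saver) {
        List<String> completed = new ArrayList<>();

        // Save the legacy OIV image, if the output dir is defined.
        if (outputDir != null && !outputDir.isEmpty()) {
            try {
                saver.save(outputDir);
                completed.add("saveLegacyOIV");
            } catch (IOException ioe) {
                // Assumption: a warning is enough; the OIV image is optional.
                System.err.println("WARN: failed to save legacy OIV image to "
                        + outputDir + ", continuing checkpoint: " + ioe.getMessage());
            }
        }

        // Placeholder for the essential work (e.g. uploading the new fsimage).
        completed.add("uploadImage");
        return completed;
    }

    public static void main(String[] args) {
        // Simulate a missing output directory: the save step throws.
        List<String> steps = doCheckpoint("/missing/oiv", dir -> {
            throw new IOException("directory does not exist");
        });
        System.out.println(steps);
    }
}
```

With the simulated missing directory, the sketch reports only the upload step as completed, which is the behavior the description asks for: the optional step fails without aborting the checkpoint.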
Attachments
Issue Links
- relates to
  - HDFS-6293 Issues with OIV processing PB-based fsimages (Closed)
  - HDFS-11717 Add unit test for HDFS-11709 StandbyCheckpointer should handle non-existing legacyOivImageDir gracefully (Resolved)