Details
- Sub-task
- Status: Resolved
- Major
- Resolution: Fixed
- QuorumJournalManager (HDFS-3077)
- None
- Reviewed
Description
Currently, if a JournalNode (JN) crashes, misses some log segments, and then comes back, it is re-added as a valid part of the quorum on the next log roll. However, it will not have a complete history of log segments (i.e., any individual JN may have gaps in its transaction history). This mirrors the behavior of the NameNode when multiple local directories are specified.
However, it would be better if a background thread noticed these gaps and "filled them in" by grabbing the segments from other JournalNodes. This increases the resilience of the system when JournalNodes get reformatted or otherwise lose their local disk.
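To make the gap-filling idea concrete, here is a minimal sketch of how such a background task might decide which segments to fetch. It is not the code committed for this issue: the class `SegmentGapFinder`, the type `Segment`, and the method `findMissing` are hypothetical names. It assumes that all segments are finalized and that segment boundaries are identical on every JournalNode, so a segment can be identified by its starting transaction ID.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch only; not the actual HDFS JournalNode sync code. */
class SegmentGapFinder {

  /** A finalized edit log segment covering transactions [startTxId, endTxId]. */
  static final class Segment {
    final long startTxId;
    final long endTxId;

    Segment(long startTxId, long endTxId) {
      this.startTxId = startTxId;
      this.endTxId = endTxId;
    }

    @Override
    public String toString() {
      return "[" + startTxId + "," + endTxId + "]";
    }
  }

  /**
   * Returns the segments that a peer JournalNode has but the local one lacks,
   * ordered by starting transaction ID. Assumes (for this sketch) that a
   * segment is uniquely identified by its starting transaction ID.
   */
  static List<Segment> findMissing(List<Segment> local, List<Segment> remote) {
    // Index the local segments by their starting transaction ID.
    Set<Long> localStarts = new HashSet<>();
    for (Segment s : local) {
      localStarts.add(s.startTxId);
    }

    // Any remote segment whose start is unknown locally is a gap to fill.
    List<Segment> missing = new ArrayList<>();
    for (Segment r : remote) {
      if (!localStarts.contains(r.startTxId)) {
        missing.add(r);
      }
    }
    missing.sort(Comparator.comparingLong((Segment s) -> s.startTxId));
    return missing;
  }

  public static void main(String[] args) {
    // The local JN was down while transactions 101..300 were written.
    List<Segment> local = List.of(new Segment(1, 100), new Segment(301, 400));
    List<Segment> remote = List.of(
        new Segment(1, 100), new Segment(101, 200),
        new Segment(201, 300), new Segment(301, 400));
    System.out.println(findMissing(local, remote));
  }
}
```

A real background thread would run a check like this periodically against each peer in turn and then download the listed segments. The transfer, validation, and locking concerns are left out of this sketch.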
Attachments
Issue Links
- blocks
  - HDFS-10659 Namenode crashes after Journalnode re-installation in an HA cluster due to missing paxos directory (Resolved)
- is blocked by
  - HDFS-11273 Move TransferFsImage#doGetUrl function to a Util class (Resolved)
- is related to
  - HDFS-12358 Handle IOException when transferring edit log to Journal current dir through JN sync (Resolved)
  - HDFS-12356 Unit test for JournalNode sync during Rolling Upgrade (Resolved)
  - HDFS-14942 Change Log Level to debug in JournalNodeSyncer#syncWithJournalAtIndex (Resolved)
- relates to
  - HDFS-11448 JN log segment syncing should support HA upgrade (Resolved)
  - HDFS-11866 JournalNode Sync should be off by default in hdfs-default.xml (Resolved)
  - HDFS-14140 JournalNodeSyncer authentication is failing in secure cluster (Resolved)
  - HDFS-12376 Enable JournalNode Sync by default (Resolved)