[HDFS-14317] Standby does not trigger edit log rolling when in-progress edit log tailing is enabled - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 2.9.0, 3.0.0
Fix Version/s: 3.0.4, 3.3.0, 3.2.1, 3.1.3
Component/s: None
Labels: None
Hadoop Flags: Reviewed

Description

The standby uses the following method to check whether it is time to trigger an edit log roll on the active:

  /**
   * @return true if the configured log roll period has elapsed.
   */
  private boolean tooLongSinceLastLoad() {
    return logRollPeriodMs >= 0 &&
      (monotonicNow() - lastLoadTimeMs) > logRollPeriodMs;
  }

In doTailEdits(), lastLoadTimeMs is updated whenever the standby successfully tails any edits:

      if (editsLoaded > 0) {
        lastLoadTimeMs = monotonicNow();
      }

The default value of dfs.ha.log-roll.period is 120 seconds and the default value of dfs.ha.tail-edits.period is 60 seconds. With in-progress edit log tailing enabled, the standby loads edits on nearly every tail cycle while the active is writing, so lastLoadTimeMs is refreshed roughly every 60 seconds and tooLongSinceLastLoad() almost never returns true. As a result, edit logs are not rolled for a long time, until dfs.namenode.edit.log.autoroll.multiplier.threshold takes effect.
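The timing argument above can be illustrated with a small simulation. This is a simplified sketch, not the actual EditLogTailer code: the class name RollCheckSketch and the method countRollTriggers are hypothetical, and the model assumes the active is always writing edits, that a triggered roll makes a finalized segment available to the same tail cycle, and that only the two default periods matter.

```java
public class RollCheckSketch {
    // Defaults quoted in the description, in milliseconds.
    static final long LOG_ROLL_PERIOD_MS = 120_000L; // dfs.ha.log-roll.period
    static final long TAIL_PERIOD_MS = 60_000L;      // dfs.ha.tail-edits.period

    /**
     * Counts how often the roll check fires during a simulated run.
     * Hypothetical model: one loop iteration per tail cycle.
     */
    static int countRollTriggers(boolean inProgressTailing, long durationMs) {
        long lastLoadTimeMs = 0L;
        int triggers = 0;
        for (long now = TAIL_PERIOD_MS; now <= durationMs; now += TAIL_PERIOD_MS) {
            // Same condition as tooLongSinceLastLoad().
            boolean tooLong = LOG_ROLL_PERIOD_MS >= 0
                && (now - lastLoadTimeMs) > LOG_ROLL_PERIOD_MS;
            if (tooLong) {
                triggers++; // the standby would ask the active to roll here
            }
            // Assumption: with in-progress tailing every cycle loads edits;
            // without it, edits only become readable after a roll.
            long editsLoaded = (inProgressTailing || tooLong) ? 1L : 0L;
            if (editsLoaded > 0) {
                lastLoadTimeMs = now; // mirrors the update in doTailEdits()
            }
        }
        return triggers;
    }

    public static void main(String[] args) {
        long oneHourMs = 3_600_000L;
        System.out.println("in-progress tailing on:  "
            + countRollTriggers(true, oneHourMs) + " roll triggers");
        System.out.println("in-progress tailing off: "
            + countRollTriggers(false, oneHourMs) + " roll triggers");
    }
}
```

Under these assumptions the elapsed time since the last load never exceeds one tail period when in-progress tailing is on, so the check never fires; with it off, the check fires periodically.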

In our deployment, this resulted in in-progress edit logs being deleted. The sequence of events was as follows: the standby was able to checkpoint twice while the in-progress edit log kept growing on the active. When the NNStorageRetentionManager decided to clean up old checkpoints and edit logs, it removed the in-progress edit log from the active and the QJM (because the txnid of the in-progress edit log was older than the 2 most recent checkpoints), irrecoverably losing a few minutes' worth of metadata.

Attachments

HDFS-14317.001.patch (9 kB), 27/Feb/19 22:36, Ekanth Sethuramalingam
HDFS-14317.002.patch (8 kB), 01/Mar/19 06:41, Ekanth Sethuramalingam
HDFS-14317.003.patch (9 kB), 01/Mar/19 20:25, Ekanth Sethuramalingam
HDFS-14317.004.patch (11 kB), 01/Mar/19 22:01, Ekanth Sethuramalingam

Issue Links

breaks: HDFS-14349 Edit log may be rolled more frequently than necessary with multiple Standby nodes (Open)

relates to: HDFS-10519 Add a configuration option to enable in-progress edit log tailing (Resolved)

People

Assignee: Ekanth Sethuramalingam

Reporter: Ekanth Sethuramalingam

Votes: 0

Watchers: 10

Dates

Created: 26/Feb/19 05:34

Updated: 31/Oct/19 00:21

Resolved: 08/Mar/19 23:06