  Hadoop HDFS / HDFS-14317

Standby does not trigger edit log rolling when in-progress edit log tailing is enabled

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.9.0, 3.0.0
    • Fix Version/s: 3.0.4, 3.3.0, 3.2.1, 3.1.3
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      The standby NameNode uses the following method to check whether it is time to trigger an edit log roll on the active NameNode.

        /**
         * @return true if the configured log roll period has elapsed.
         */
        private boolean tooLongSinceLastLoad() {
          return logRollPeriodMs >= 0 &&
            (monotonicNow() - lastLoadTimeMs) > logRollPeriodMs;
        }
      

      In doTailEdits(), lastLoadTimeMs is updated whenever the standby successfully tails any edits:

            if (editsLoaded > 0) {
              lastLoadTimeMs = monotonicNow();
            }
      

      The default for dfs.ha.log-roll.period is 120 seconds and the default for dfs.ha.tail-edits.period is 60 seconds. With in-progress edit log tailing enabled, the standby loads new edits on nearly every tailing pass, so lastLoadTimeMs is reset about every 60 seconds and the elapsed time never exceeds the 120-second roll period. As a result, tooLongSinceLastLoad() almost never returns true, and the edit log is not rolled for a long time, until dfs.namenode.edit.log.autoroll.multiplier.threshold takes effect.
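
      The timing argument above can be checked with a small simulation. This is a minimal sketch, not the EditLogTailer code: the class TailerRollSketch, the method countRollTriggers, and the simulated clock are hypothetical. It assumes that with in-progress tailing every pass loads at least one edit, and that without it edits only become loadable on a pass where a roll was just triggered.

```java
public class TailerRollSketch {
    // Defaults quoted in the description, in milliseconds.
    static final long LOG_ROLL_PERIOD_MS = 120_000L; // dfs.ha.log-roll.period
    static final long TAIL_PERIOD_MS = 60_000L;      // dfs.ha.tail-edits.period

    // Same predicate as tooLongSinceLastLoad(), with the clock passed in
    // explicitly so the simulation does not depend on real time.
    static boolean tooLongSinceLastLoad(long nowMs, long lastLoadTimeMs) {
        return LOG_ROLL_PERIOD_MS >= 0
            && (nowMs - lastLoadTimeMs) > LOG_ROLL_PERIOD_MS;
    }

    // Hypothetical helper: runs the given number of tailing passes and
    // counts how often the standby would ask the active to roll its log.
    static int countRollTriggers(int iterations, boolean inProgressTailing) {
        long nowMs = 0L;
        long lastLoadTimeMs = 0L;
        int rollTriggers = 0;
        for (int i = 0; i < iterations; i++) {
            nowMs += TAIL_PERIOD_MS; // one tailing pass per tail period
            boolean rolled = tooLongSinceLastLoad(nowMs, lastLoadTimeMs);
            if (rolled) {
                rollTriggers++;
            }
            // Assumption: in-progress tailing always finds new edits;
            // otherwise edits are only visible right after a roll.
            boolean editsLoaded = inProgressTailing || rolled;
            if (editsLoaded) {
                lastLoadTimeMs = nowMs; // mirrors the update in doTailEdits()
            }
        }
        return rollTriggers;
    }

    public static void main(String[] args) {
        System.out.println(countRollTriggers(10, true));  // prints 0
        System.out.println(countRollTriggers(10, false)); // prints 3
    }
}
```

      Under these assumptions, ten tailing passes trigger three rolls without in-progress tailing and none with it, because the elapsed time since the last load never grows past 60 seconds.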

      In our deployment, this resulted in in-progress edit logs being deleted. The standby was able to checkpoint twice while the in-progress edit log kept growing on the active. When the NNStorageRetentionManager decided to clean up old checkpoints and edit logs, it removed the in-progress edit log from both the active and the QJM, because the txnid of the in-progress edit log was older than the two most recent checkpoints. This irrecoverably lost a few minutes' worth of metadata.

        Attachments

        1. HDFS-14317.001.patch (9 kB, Ekanth Sethuramalingam)
        2. HDFS-14317.002.patch (8 kB, Ekanth Sethuramalingam)
        3. HDFS-14317.003.patch (9 kB, Ekanth Sethuramalingam)
        4. HDFS-14317.004.patch (11 kB, Ekanth Sethuramalingam)

              People

              • Assignee: Ekanth Sethuramalingam (ekanth)
              • Reporter: Ekanth Sethuramalingam (ekanth)
