Hadoop HDFS / HDFS-14500

NameNode StartupProgress continues to report edit log segments after the LOADING_EDITS phase is finished



    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 3.1.2
    • Fix Version/s: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 3.1.3
    • Component/s: namenode
    • Labels: None


      When testing a cluster with the edit log tailing fast path feature enabled (HDFS-13150), an unrelated issue caused the NameNode to remain in safe mode for an extended period of time, preventing it from fully completing its startup sequence. We noticed that the Startup Progress web UI displayed millions of edit log segments.

      I traced this problem back to StartupProgress. Within FSEditLogLoader, the loader continually tries to update the startup progress with a new Step any time that it loads edits. Per the Javadoc for StartupProgress, this should be a no-op once startup is completed:

       * After startup completes, the tracked data is frozen.  Any subsequent updates
       * or counter increments are no-ops.

      However, StartupProgress applies that logic only once the entire startup sequence has completed. When FSEditLogLoader calls beginStep(), the step is added to the LOADING_EDITS phase:

          StartupProgress prog = NameNode.getStartupProgress();
          Step step = createStartupProgressStep(edits);
          prog.beginStep(Phase.LOADING_EDITS, step);

      This phase, in our case, ended long before, so it is nonsensical to continue to add steps to it. I believe it is a bug that StartupProgress accepts such steps instead of ignoring them; once a phase is complete, it should no longer change.
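      The proposed behavior can be illustrated with a minimal sketch. This is not the Hadoop implementation or the attached patch: PhaseTrackerSketch, its nested enums, the string step names, and stepCount() are all hypothetical stand-ins for StartupProgress, Phase, Status, and Step. It only shows the idea that beginStep() should become a no-op for a phase that has already ended, even while later phases (such as safe mode) are still running.

```java
import java.util.EnumMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified stand-in for StartupProgress; not the Hadoop class. */
class PhaseTrackerSketch {
  // Assumed phase and status names, used here for illustration only.
  enum Phase { LOADING_FSIMAGE, LOADING_EDITS, SAVING_CHECKPOINT, SAFEMODE }
  enum Status { PENDING, RUNNING, COMPLETE }

  private final Map<Phase, Status> status = new EnumMap<>(Phase.class);
  private final Map<Phase, Set<String>> steps = new EnumMap<>(Phase.class);

  PhaseTrackerSketch() {
    for (Phase p : Phase.values()) {
      status.put(p, Status.PENDING);
      steps.put(p, new LinkedHashSet<>());
    }
  }

  synchronized void beginPhase(Phase p) {
    if (status.get(p) == Status.PENDING) {
      status.put(p, Status.RUNNING);
    }
  }

  synchronized void endPhase(Phase p) {
    status.put(p, Status.COMPLETE);
  }

  /** Records a step unless the phase itself is already complete. */
  synchronized void beginStep(Phase p, String step) {
    if (status.get(p) == Status.COMPLETE) {
      return; // per-phase freeze: late steps are ignored
    }
    steps.get(p).add(step);
  }

  synchronized int stepCount(Phase p) {
    return steps.get(p).size();
  }

  public static void main(String[] args) {
    PhaseTrackerSketch prog = new PhaseTrackerSketch();
    prog.beginPhase(Phase.LOADING_EDITS);
    prog.beginStep(Phase.LOADING_EDITS, "segment-1");
    prog.endPhase(Phase.LOADING_EDITS);
    prog.beginPhase(Phase.SAFEMODE); // startup as a whole is still in progress

    // Edits tailed after the phase ended must not add new steps.
    prog.beginStep(Phase.LOADING_EDITS, "segment-2");
    System.out.println("steps recorded: " + prog.stepCount(Phase.LOADING_EDITS));
  }
}
```

      In this sketch the completeness check is made per phase rather than for the whole startup sequence, so the step count for LOADING_EDITS stays at 1 even though SAFEMODE has not finished.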


        1. HDFS-14500.000.patch
          7 kB
          Erik Krogen
        2. HDFS-14500.001.patch
          8 kB
          Erik Krogen
        3. HDFS-14500-branch-2.001.patch
          7 kB
          Erik Krogen

              Assignee: Erik Krogen (xkrogen)
              Reporter: Erik Krogen (xkrogen)