HDFS-6019 (Hadoop HDFS; sub-task of HDFS-5535, Umbrella jira for improved HDFS rolling upgrades)

Standby NN might not checkpoint when processing the rolling upgrade marker

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: datanode, ha, hdfs-client, namenode
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      FSEditLogLoader calls FSNamesystem#triggerRollbackCheckpoint() when it processes the rolling upgrade marker. The method looks like this:

      void triggerRollbackCheckpoint() {
        if (standbyCheckpointer != null) {
          standbyCheckpointer.triggerRollbackCheckpoint();
        }
      }
      

      There is a race condition in which standbyCheckpointer is still null when the marker is processed. The NameNode constructor calls initialize(), which eventually starts the edit log tailer, but the standby checkpointer is only created later, in HAState#enterState(). If the tailer reaches the marker before the checkpointer exists, the null check silently drops the request and the standby NN never takes the rollback checkpoint.
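
      The ordering problem can be reproduced in isolation. The following is a minimal sketch, not the code from the attached patches: every class and method name in it (Checkpointer, RacyNamesystem, DeferredNamesystem, setStandbyCheckpointer) is a hypothetical stand-in. It shows the request being lost when the marker arrives first, and one possible mitigation in which the request is recorded and replayed once the checkpointer is installed.

```java
// Hypothetical stand-ins only; none of these classes exist in Hadoop.
class RollbackCheckpointSketch {

    /** Stand-in for the standby checkpointer; counts rollback checkpoints. */
    static class Checkpointer {
        int checkpoints = 0;

        void triggerRollbackCheckpoint() {
            checkpoints++;
        }
    }

    /** Same shape as the method in the description: a null checkpointer drops the request. */
    static class RacyNamesystem {
        volatile Checkpointer standbyCheckpointer;

        void triggerRollbackCheckpoint() {
            if (standbyCheckpointer != null) {
                standbyCheckpointer.triggerRollbackCheckpoint();
            }
        }
    }

    /** Assumed mitigation: remember the request until a checkpointer is installed. */
    static class DeferredNamesystem {
        private Checkpointer standbyCheckpointer;
        private boolean rollbackCheckpointPending;

        synchronized void triggerRollbackCheckpoint() {
            if (standbyCheckpointer != null) {
                standbyCheckpointer.triggerRollbackCheckpoint();
            } else {
                rollbackCheckpointPending = true; // replayed in setStandbyCheckpointer
            }
        }

        synchronized void setStandbyCheckpointer(Checkpointer checkpointer) {
            standbyCheckpointer = checkpointer;
            if (rollbackCheckpointPending) {
                rollbackCheckpointPending = false;
                checkpointer.triggerRollbackCheckpoint();
            }
        }
    }

    /** Marker is processed before the checkpointer exists; returns checkpoints taken. */
    static int runRacy() {
        RacyNamesystem ns = new RacyNamesystem();
        ns.triggerRollbackCheckpoint();          // edit log tailer sees the marker
        Checkpointer checkpointer = new Checkpointer();
        ns.standbyCheckpointer = checkpointer;   // standby state entered afterwards
        return checkpointer.checkpoints;
    }

    /** Same ordering, but the pending request is replayed on installation. */
    static int runDeferred() {
        DeferredNamesystem ns = new DeferredNamesystem();
        ns.triggerRollbackCheckpoint();
        Checkpointer checkpointer = new Checkpointer();
        ns.setStandbyCheckpointer(checkpointer);
        return checkpointer.checkpoints;
    }

    public static void main(String[] args) {
        System.out.println("racy=" + runRacy() + " deferred=" + runDeferred());
    }
}
```

      The sketch replays the bad interleaving deterministically on one thread. In the NameNode the two steps run on different threads, so the loss would only occur when the tailer happens to reach the marker first.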

      Attachments

        1. HDFS-6019.000.patch (7 kB, Haohui Mai)
        2. HDFS-6019.001.patch (6 kB, Jing Zhao)

          People

            Assignee: Haohui Mai (wheat9)
            Reporter: Haohui Mai (wheat9)
            Votes: 0
            Watchers: 2