  1. Hadoop HDFS
  2. HDFS-3736

Failure in starting NN due to fsimage loading failure

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: None
    • Component/s: ha, namenode
    • Labels: None

    Description

      We came across the following situation in our test environment, with the NNs running in HA mode.
      While uploading a checkpoint, renaming the MD5 file from its tmp name to the actual file name failed for an unknown reason (a non-IO exception).
      At the same time, a connection timeout occurred on the standby side.
      This left a tmp MD5 file and the original fsimage file in the name dir of the active NN (the ckpt fsimage file had been renamed successfully to the original fsimage file).
      On restart, the NN checks for the MD5 file and, since it is not found, startup fails.
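
The state described above (a finalized fsimage file with no MD5 file next to it) can be detected before restarting the NN. The following is only a minimal sketch, not Hadoop code: the class `Md5SidecarCheck` and its method are hypothetical, and it assumes that finalized images are named `fsimage_<txid>` and that their checksums are stored as `fsimage_<txid>.md5` in the same directory. Check the naming against your own name dir before relying on it.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of Hadoop. */
class Md5SidecarCheck {
    // Assumed naming: image "fsimage_<txid>", checksum "fsimage_<txid>.md5".
    static final String MD5_SUFFIX = ".md5";

    /** Returns the sorted names of fsimage files in dir that have no ".md5" file. */
    static List<String> imagesMissingMd5(Path dir) throws IOException {
        List<String> missing = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "fsimage_*")) {
            for (Path p : stream) {
                String name = p.getFileName().toString();
                // Skip checksum files and leftover tmp files; keep only fsimage_<digits>.
                if (!name.matches("fsimage_\\d+")) {
                    continue;
                }
                if (!Files.exists(dir.resolve(name + MD5_SUFFIX))) {
                    missing.add(name);
                }
            }
        }
        Collections.sort(missing);
        return missing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Md5SidecarCheck <directory containing fsimage files>");
            return;
        }
        for (String name : imagesMissingMd5(Paths.get(args[0]))) {
            System.out.println("no MD5 file for " + name);
        }
    }
}
```

The sketch only reports the inconsistency. It does not recompute or write a checksum, since how to recover safely depends on the state of the other name dirs and of the standby.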

            People

              Assignee: Andrew Wang (andrew.wang)
              Reporter: suja s (suja)
              Votes: 0
              Watchers: 11
