Hadoop HDFS / HDFS-14396

Failed to load image from FSImageFile when downgrade from 3.x to 2.x


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0, 3.2.1, 3.1.3
    • Component/s: rolling upgrades
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
  During a rolling upgrade from Hadoop 2.x to 3.x, the NameNode cannot persist erasure coding information, so a user cannot start using the erasure coding feature until the upgrade is finalized.

      Description

      After applying the fix for HDFS-13596, we tried to downgrade from 3.x to 2.x, but the NameNode fails to start because an exception occurs. The log message follows:

      2019-01-23 17:22:18,730 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/data1/hadoopdata/hadoop-namenode/current/fsimage_0000000000000025310, cpktTxId=0000000000000025310)
      java.lang.NullPointerException
              at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:243)
              at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:179)
              at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:226)
              at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:885)
              at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:869)
              at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:742)
              at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:673)
              at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:290)
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:998)
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:700)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:612)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:672)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:839)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1517)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1583)
      2019-01-23 17:22:19,023 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
      java.io.IOException: Failed to load FSImage file, see error(s) above for more info.
              at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:688)
              at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:290)
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:998)
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:700)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:612)
      

      This issue occurs because the 3.x NameNode saves the fsimage with erasure coding (EC) fields even during a rolling upgrade, producing an image that a 2.x NameNode cannot load. The fix is to avoid persisting the EC fields until the upgrade is finalized.
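
      The idea behind the fix can be sketched as follows. This is a minimal illustration, not the actual HDFS-14396 patch: all class, method, and section names here are hypothetical, and the layout-version constant is an illustrative value. It shows the general technique of gating a new fsimage section on the rolling-upgrade state so that an older NameNode never encounters a section it cannot parse.

      ```java
      import java.util.ArrayList;
      import java.util.List;

      public class FsImageSaverSketch {
          // Illustrative layout version that introduced erasure coding.
          // HDFS layout versions are negative and decrease as features are added.
          static final int EC_MIN_LAYOUT_VERSION = -64;

          // Decide which fsimage sections to write (hypothetical helper).
          static List<String> sectionsToWrite(int layoutVersion,
                                              boolean upgradeFinalized) {
              List<String> sections = new ArrayList<>();
              sections.add("INODE");
              sections.add("SNAPSHOT");
              // Write the EC section only when the layout version supports it
              // AND the rolling upgrade has been finalized; otherwise a 2.x
              // NameNode reading this image would hit an unknown section and
              // fail during load, as in the NullPointerException above.
              if (layoutVersion <= EC_MIN_LAYOUT_VERSION && upgradeFinalized) {
                  sections.add("ERASURE_CODING");
              }
              return sections;
          }

          public static void main(String[] args) {
              // During the rolling upgrade: no EC section, image stays 2.x-readable.
              System.out.println(sectionsToWrite(-64, false));
              // After finalize: EC section is written.
              System.out.println(sectionsToWrite(-64, true));
          }
      }
      ```

      This is also why the release note above says EC cannot be used until finalize: as long as a downgrade to 2.x remains possible, the image must stay readable by the older loader.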

        Attachments

        1. HDFS-14396.001.patch
          5 kB
          Fei Hui
        2. HDFS-14396.002.patch
          2 kB
          Fei Hui

          Issue Links

            Activity

              People

               • Assignee:
                 ferhui Fei Hui
               • Reporter:
                 ferhui Fei Hui
               • Votes:
                 0
               • Watchers:
                 12

                Dates

                • Created:
                • Updated:
                • Resolved: