Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-6908

incorrect snapshot directory diff generated by snapshot deletion

VotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • None
    • 2.6.0
    • snapshots
    • None
    • Reviewed

    Description

      In the following scenario, delete snapshot could generate incorrect snapshot directory diff and corrupted fsimage, if you restart NN after that, you will get NullPointerException.
      1. create a directory and create a file under it
      2. take a snapshot
      3. create another file under that directory
      4. take second snapshot
      5. delete both files and the directory
      6. delete second snapshot
      incorrect directory diff will be generated.

      Restart NN will throw NPE

      java.lang.NullPointerException
      	at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.addToDeletedList(FSImageFormatPBSnapshot.java:246)
      	at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDeletedList(FSImageFormatPBSnapshot.java:265)
      	at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDirectoryDiffList(FSImageFormatPBSnapshot.java:328)
      	at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadSnapshotDiffSection(FSImageFormatPBSnapshot.java:192)
      	at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:254)
      	at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:168)
      	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:208)
      	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:906)
      	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:892)
      	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:715)
      	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:653)
      	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:276)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:882)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:629)
      	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:498)
      	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:554)
      

      Attachments

        1. HDFS-6908.001.patch
          3 kB
          Juan Yu
        2. HDFS-6908.002.patch
          5 kB
          Juan Yu
        3. HDFS-6908.003.patch
          5 kB
          Juan Yu

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            jyu@cloudera.com Juan Yu
            jyu@cloudera.com Juan Yu
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment