  1. Hadoop HDFS
  2. HDFS-7934

Update RollingUpgrade rollback documentation: should use bootstrapstandby for standby NN

    Details

    • Hadoop Flags:
      Reviewed

      Description

      During rolling upgrade rollback, standby NameNode startup fails while loading edits when there is no local copy of the edits created after the upgrade (these have already been removed by the active NameNode from the journal manager and from the active's local storage).

      1. HDFS-7934.1.patch
        1 kB
        J.Andreina
      2. HDFS-7934.2.patch
        1 kB
        J.Andreina

        Activity

        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2115 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2115/)
        HDFS-7934. Update RollingUpgrade rollback documentation: should use bootstrapstandby for standby NN. Contributed by J. Andreina. (jing9: rev b172d03595d1591e7f542791224607d8c5fce3e2)

        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        • hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #166 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/166/)
        HDFS-7934. Update RollingUpgrade rollback documentation: should use bootstrapstandby for standby NN. Contributed by J. Andreina. (jing9: rev b172d03595d1591e7f542791224607d8c5fce3e2)

        • hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk #899 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/899/)
        HDFS-7934. Update RollingUpgrade rollback documentation: should use bootstrapstandby for standby NN. Contributed by J. Andreina. (jing9: rev b172d03595d1591e7f542791224607d8c5fce3e2)

        • hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #165 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/165/)
        HDFS-7934. Update RollingUpgrade rollback documentation: should use bootstrapstandby for standby NN. Contributed by J. Andreina. (jing9: rev b172d03595d1591e7f542791224607d8c5fce3e2)

        • hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #156 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/156/)
        HDFS-7934. Update RollingUpgrade rollback documentation: should use bootstrapstandby for standby NN. Contributed by J. Andreina. (jing9: rev b172d03595d1591e7f542791224607d8c5fce3e2)

        • hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk #2097 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2097/)
        HDFS-7934. Update RollingUpgrade rollback documentation: should use bootstrapstandby for standby NN. Contributed by J. Andreina. (jing9: rev b172d03595d1591e7f542791224607d8c5fce3e2)

        • hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #7592 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7592/)
        HDFS-7934. Update RollingUpgrade rollback documentation: should use bootstrapstandby for standby NN. Contributed by J. Andreina. (jing9: rev b172d03595d1591e7f542791224607d8c5fce3e2)

        • hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        jingzhao Jing Zhao added a comment -

        Thanks for the fix, J.Andreina! The patch looks good to me. +1

        I've already committed this to trunk, branch-2, and branch-2.7.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12725207/HDFS-7934.2.patch
        against trunk revision b5a0b24.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/10273//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10273//console

        This message is automatically generated.

        andreina J.Andreina added a comment -

        Uploaded a patch that modifies the documented steps for rolling upgrade rollback.
        Please review.

        andreina J.Andreina added a comment -

        Thanks Jing Zhao and Vinayakumar B for your comments.

        I verified starting the standby NameNode with the "-bootstrapStandby" flag instead of the "-rollingUpgrade rollback" option (as mentioned in the document).
        With that change I did not hit the HDFS-7934 and HDFS-7952 issues.

        I'll soon provide a patch for updating the document.

        jingzhao Jing Zhao added a comment - - edited

        For HA upgrade (not rollingUpgrade), we currently also require using "-bootstrapStandby" for the SBN after rolling back the ANN. Rolling rollback should follow a similar path; otherwise we will hit issues like HDFS-7952, since rolling back the JNs requires the corresponding NN to become the QJM writer.

        jingzhao Jing Zhao added a comment -

        Should we run "-bootstrapStandby" for SBN after rolling back the ANN, instead of running the "rollingUpgrade rollback" again on the SBN?

        vinayrpet Vinayakumar B added a comment -

        Hi Tsz Wo Nicholas Sze, do you want to take a look at this?

        I feel this is a blocker.

        vinayrpet Vinayakumar B added a comment -

        Hi Arpit Agarwal, do you have any opinions on this?

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12705338/HDFS-7934.1.patch
        against trunk revision 9d72f93.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:

        org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9958//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9958//console

        This message is automatically generated.

        andreina J.Andreina added a comment -

        Hi Vinayakumar B,

        Thanks for your comments. Uploaded an initial patch as per your suggestion and verified it locally (standby NameNode startup is successful).

        Please review the patch.

        vinayrpet Vinayakumar B added a comment -

        I think the problematic area is the code block below in FSImage#loadFSImage(..)

              // For rollback in rolling upgrade, we need to set the toAtLeastTxId to
              // the txid right before the upgrade marker.  
              long toAtLeastTxId = editLog.isOpenForWrite() ? inspector
                  .getMaxSeenTxId() : 0;
              if (rollingRollback) {
                // note that the first image in imageFiles is the special checkpoint
                // for the rolling upgrade
                toAtLeastTxId = imageFiles.get(0).getCheckpointTxId() + 2;
              }

        In the rollingRollback case, nothing is read from the edit streams, so setting toAtLeastTxId = imageFiles.get(0).getCheckpointTxId() + 2; is not required. Removing this line will solve the problem, IMO.

        Any thoughts?
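
        The arithmetic behind the quoted block can be sketched in isolation. This is a simplified model, not Hadoop's actual FSImage code: the class name RollbackTxIdSketch and the method signature are hypothetical, and only the "+ 2" rule and the checkpoint txid 29 (from fsimage_rollback_0000000000000000029 in the reproduction listing) come from this thread.

```java
// Simplified model of the quoted rule; NOT Hadoop's FSImage implementation.
// Class and method names here are hypothetical, chosen for illustration only.
public class RollbackTxIdSketch {

    /**
     * Mirrors the quoted logic: normally the loader requires edits up to the
     * max seen txid only when the edit log is open for write; in a rolling
     * rollback it instead requires the rollback checkpoint txid plus 2.
     */
    public static long toAtLeastTxId(boolean openForWrite, long maxSeenTxId,
                                     boolean rollingRollback,
                                     long rollbackCheckpointTxId) {
        long toAtLeast = openForWrite ? maxSeenTxId : 0;
        if (rollingRollback) {
            toAtLeast = rollbackCheckpointTxId + 2;
        }
        return toAtLeast;
    }

    public static void main(String[] args) {
        // fsimage_rollback_0000000000000000029 => checkpoint txid 29
        System.out.println(toAtLeastTxId(false, 0, true, 29));  // prints 31
        // Without the rollback flag, a standby (not open for write) needs 0
        System.out.println(toAtLeastTxId(false, 0, false, 29)); // prints 0
    }
}
```

        The value 31 matches the "at least txid 31" reported in the exception for this issue.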

        andreina J.Andreina added a comment -

        Steps to Reproduce:
        =================

        Step 1: Start NN1 as active and NN2 as standby.
        Step 2: Run "hdfs dfsadmin -rollingUpgrade prepare".
        Step 3: Start NN2 as active and NN1 as standby with the "-rollingUpgrade started" option.
        Step 4: Restart the DN in upgrade mode as well.

        NN2 active:
        -rw-r--r-- 1 Rex users 1048576 Mar 13 17:36 edits_inprogress_0000000000000000031
        -rw-r--r-- 1 Rex users     350 Mar 13 17:33 fsimage_0000000000000000000
        -rw-r--r-- 1 Rex users      62 Mar 13 17:33 fsimage_0000000000000000000.md5
        -rw-r--r-- 1 Rex users     622 Mar 13 17:36 fsimage_rollback_0000000000000000029
        -rw-r--r-- 1 Rex users      71 Mar 13 17:36 fsimage_rollback_0000000000000000029.md5
        -rw-r--r-- 1 Rex users       2 Mar 13 17:33 seen_txid
        -rw-r--r-- 1 Rex users     206 Mar 13 17:36 VERSION
        

        Step 5: Shut down NN2 (active); NN1 becomes active.
        Step 6: Write files.

        NN1 active:
        -rw-r--r-- 1 Rex users    1817 Mar 13 17:35 edits_0000000000000000001-0000000000000000026
        -rw-r--r-- 1 Rex users      67 Mar 13 17:35 edits_0000000000000000027-0000000000000000029
        -rw-r--r-- 1 Rex users 1048576 Mar 13 17:35 edits_0000000000000000030-0000000000000000030
        -rw-r--r-- 1 Rex users 1048576 Mar 13 17:39 edits_inprogress_0000000000000000032
        -rw-r--r-- 1 Rex users     350 Mar 13 17:32 fsimage_0000000000000000000
        -rw-r--r-- 1 Rex users      62 Mar 13 17:32 fsimage_0000000000000000000.md5
        -rw-r--r-- 1 Rex users     622 Mar 13 17:36 fsimage_rollback_0000000000000000029
        -rw-r--r-- 1 Rex users      71 Mar 13 17:36 fsimage_rollback_0000000000000000029.md5
        -rw-r--r-- 1 Rex users       3 Mar 13 17:35 seen_txid
        -rw-r--r-- 1 Rex users     206 Mar 13 17:32 VERSION
        

        Step 7: Bring down both NNs.
        Step 8: Start NN2 and NN1 with the "-rollingUpgrade rollback" option.

        Issue:
        ======
        NN2 (active) started successfully, but NN1 (standby) startup failed with the following exception:

        15/03/13 17:41:30 ERROR namenode.NameNode: Failed to start namenode.
        java.io.IOException: Gap in transactions. Expected to be able to read up until at least txid 31 but unable to find any edit logs containing txid 31
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.checkForGaps(FSEditLog.java:1617)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1575)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:647)
        
        NN2 active:
        
        -rw-r--r-- 1 Rex users 1048576 Mar 13 17:36 edits_0000000000000000031-0000000000000000031.trash
        -rw-r--r-- 1 Rex users 1048576 Mar 13 17:40 edits_inprogress_0000000000000000030
        -rw-r--r-- 1 Rex users     350 Mar 13 17:33 fsimage_0000000000000000000
        -rw-r--r-- 1 Rex users      62 Mar 13 17:33 fsimage_0000000000000000000.md5
        -rw-r--r-- 1 Rex users     622 Mar 13 17:36 fsimage_0000000000000000029
        -rw-r--r-- 1 Rex users      62 Mar 13 17:40 fsimage_0000000000000000029.md5
        -rw-r--r-- 1 Rex users       2 Mar 13 17:33 seen_txid
        -rw-r--r-- 1 Rex users     206 Mar 13 17:40 VERSION
        
        NN1 standby:
        
        -rw-r--r-- 1 Rex users    1817 Mar 13 17:35 edits_0000000000000000001-0000000000000000026
        -rw-r--r-- 1 Rex users      67 Mar 13 17:35 edits_0000000000000000027-0000000000000000029
        -rw-r--r-- 1 Rex users 1048576 Mar 13 17:35 edits_0000000000000000030-0000000000000000030
        -rw-r--r-- 1 Rex users 1048576 Mar 13 17:39 edits_0000000000000000032-0000000000000000062
        -rw-r--r-- 1 Rex users     350 Mar 13 17:32 fsimage_0000000000000000000
        -rw-r--r-- 1 Rex users      62 Mar 13 17:32 fsimage_0000000000000000000.md5
        -rw-r--r-- 1 Rex users     622 Mar 13 17:36 fsimage_rollback_0000000000000000029
        -rw-r--r-- 1 Rex users      71 Mar 13 17:36 fsimage_rollback_0000000000000000029.md5
        -rw-r--r-- 1 Rex users       3 Mar 13 17:35 seen_txid
        -rw-r--r-- 1 Rex users     206 Mar 13 17:32 VERSION
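
        The gap reported in the exception can be reproduced from the NN1 standby listing alone. The following is a minimal sketch, assuming that only finalized segments named edits_<first>-<last> count as readable; EditGapSketch and missingTxIds are hypothetical names, and this is not how FSEditLog.checkForGaps is actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a toy coverage check over edit segment file names.
// It is NOT Hadoop's FSEditLog logic; names here are hypothetical.
public class EditGapSketch {
    // Finalized segments look like edits_0000000000000000001-0000000000000000026.
    // In-progress, .trash, fsimage and other files do not match and are ignored.
    private static final Pattern FINALIZED =
            Pattern.compile("edits_(\\d+)-(\\d+)");

    /** Returns the txids in [from, to] that no finalized segment covers. */
    public static List<Long> missingTxIds(List<String> fileNames,
                                          long from, long to) {
        List<Long> missing = new ArrayList<>();
        for (long txid = from; txid <= to; txid++) {
            boolean covered = false;
            for (String name : fileNames) {
                Matcher m = FINALIZED.matcher(name);
                if (m.matches()
                        && Long.parseLong(m.group(1)) <= txid
                        && txid <= Long.parseLong(m.group(2))) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                missing.add(txid);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Edit segments from the "NN1 standby" listing after the failed start.
        List<String> nn1 = Arrays.asList(
                "edits_0000000000000000001-0000000000000000026",
                "edits_0000000000000000027-0000000000000000029",
                "edits_0000000000000000030-0000000000000000030",
                "edits_0000000000000000032-0000000000000000062");
        // The rollback image is at txid 29, so check 30 up to the required 31.
        System.out.println(missingTxIds(nn1, 30, 31)); // prints [31]
    }
}
```

        NN1 never had a local segment containing txid 31 (it was written while NN2 was active), which is consistent with the "unable to find any edit logs containing txid 31" message.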
        

          People

          • Assignee:
            andreina J.Andreina
            Reporter:
            andreina J.Andreina
          • Votes:
            0
            Watchers:
            8
