  1. Hadoop HDFS
  2. HDFS-15036

Active NameNode should not silently fail the image transfer

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.10.0
    • Fix Version/s: 3.3.0, 3.1.4, 3.2.2, 2.10.1
    • Component/s: namenode
    • Labels: None

    Description

      Image transfer from Standby NameNode to Active silently fails on Active, without any logging and without notifying the sender side.

      Attachments

        1. HDFS-15036.001.patch
          3 kB
          Chen Liang
        2. HDFS-15036.002.patch
          5 kB
          Chen Liang
        3. HDFS-15036.003.patch
          5 kB
          Chen Liang

        Issue Links

          Activity

            shv Konstantin Shvachko added a comment -

            This can happen during checkpointing or while preparing for a rolling upgrade.
            We observed it during a rolling upgrade, when Standby was reporting: "Rollback image has been created. Proceed to upgrade daemons." while Active still reported "Rollback image has not been created."

            In the logs for ANN I see that it started receiving the image:

            2019-12-05 23:14:56,328 INFO org.apache.hadoop.hdfs.server.namenode.ImageServlet: ImageServlet allowing checkpointer: hdfs/active.namenode.com

            But ANN did not print anything related to the image transfer afterwards, and the transferred image is missing from its storage directory.
            The ANN log message comes from isValidRequestor() called by ImageServlet.doPut().

            The SBN log indicates that the image was fully and successfully transferred to ANN:

            2019-12-05 23:22:29,526 INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Sending fileName: /hdfs-storage-dir/current/fsimage_rollback_0000000000773999609, fileSize: 1889021016. Sent total: 1889021016 bytes. Size of last segment intended to send: -1 bytes.

            The SBN log message comes from TransferFsImage.copyFileToStream().

            Looking at the code in ImageServlet.doPut(), I see that one of the methods it calls is Util.receiveFile(). If an Exception is thrown inside the while-loop that reads from the input (socket) stream and writes to the output (image file) stream, it goes through a series of finally blocks without catching the exception, logging it, or reporting the error to the sender.

            We should:

            1. Catch and log any exceptions occurring there
            2. Notify SBN about the error, so that it could retry the transfer
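The two proposals above (catch and log, then notify the sender) could look roughly like the following. This is only a minimal sketch under stated assumptions: `ReceiveSketch`, `receive()` and the `ErrorReporter` callback are hypothetical names invented for illustration and are not the actual `Util.receiveFile()` code; `ErrorReporter` merely stands in for whatever mechanism would send an error response back to the SBN.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch; not the actual Util.receiveFile() implementation. */
final class ReceiveSketch {
  private static final Logger LOG = Logger.getLogger("ReceiveSketch");

  /** Hypothetical callback standing in for an error response to the sender. */
  interface ErrorReporter {
    void report(String message);
  }

  /**
   * Copies everything from in to out and returns the number of bytes copied.
   * On failure the exception is logged and reported to the sender before it
   * is rethrown, so neither side stays silent.
   */
  static long receive(InputStream in, OutputStream out, ErrorReporter reporter)
      throws IOException {
    byte[] buf = new byte[4096];
    long total = 0;
    try {
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
        total += n;
      }
      return total;
    } catch (IOException e) {
      // 1. Catch and log the exception on the receiving side.
      LOG.log(Level.WARNING, "Image transfer failed after " + total + " bytes", e);
      // 2. Tell the sender, so that it can retry the transfer.
      reporter.report("Image transfer failed: " + e.getMessage());
      throw e;
    }
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    long n = receive(new ByteArrayInputStream(new byte[] {1, 2, 3}), out,
        msg -> System.err.println("would notify sender: " + msg));
    System.out.println("received " + n + " bytes");
  }
}
```

Rethrowing after logging keeps the caller's existing cleanup path intact; whether the real fix should rethrow or convert the failure into an error response is a design choice this sketch does not settle.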
            vagarychen Chen Liang added a comment - - edited

            Spent some time debugging this issue, and I think I found the cause.

            In HDFS-12979, we introduced logic that rejects an image upload request if the image being uploaded is not far enough ahead of the previous image. This prevents multiple SbNs from all uploading images to ANN too frequently. This is considered correct behavior, so no error is logged (the "silent" part). Both ANN and SbN simply ignore it and proceed.

            But it now appears that, as a side effect of this change, the rollback image also has to go through this check during RU, and it can be rejected too. If this happens, SbN proceeds assuming the upload is done, while ANN proceeds without having received the rollback image. The upload silently failed in this case.

            The check that rejects the upload is in ImageServlet. In my earlier test, I just commented out the whole block below and the issue seemed to go away. But I think the fix is probably just adding a new check so that this rejection applies only to regular image uploads, not rollback images, like the newly added line in the following code snippet. I haven't actually tested changing it this way, though:

                          if (checkRecentImageEnable &&
                              NameNodeFile.IMAGE.equals(parsedParams.getNameNodeFile()) && // <--- this should fix the issue, as NameNodeFile.IMAGE_ROLLBACK should bypass this
                              timeDelta < checkpointPeriod &&
                              txid - lastCheckpointTxid < checkpointTxnCount) {
                            // only when at least one of two conditions are met we accept
                            // a new fsImage
                            // 1. most recent image's txid is too far behind
                            // 2. last checkpoint time was too old
                            response.sendError(HttpServletResponse.SC_CONFLICT,
                                "Most recent checkpoint is neither too far behind in "
                                    + "txid, nor too old. New txnid cnt is "
                                    + (txid - lastCheckpointTxid)
                                    + ", expecting at least " + checkpointTxnCount
                                    + " unless too long since last upload.");
                            return null;
                          }
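To make the effect of the added line easier to see in isolation, here is a minimal, self-contained sketch of the same predicate. It is an illustration only: `RecentImageCheckSketch`, `shouldReject()` and the two-value `NameNodeFile` enum are hypothetical simplifications, not the actual ImageServlet code, and the numbers in `main` are made up.

```java
/** Hypothetical, simplified stand-in for the check quoted above. */
final class RecentImageCheckSketch {
  /** Mirrors only the two image kinds discussed in this issue. */
  enum NameNodeFile { IMAGE, IMAGE_ROLLBACK }

  /**
   * Returns true when the upload would be rejected: the check is enabled,
   * the file is a regular checkpoint image, the last checkpoint is recent,
   * and too few transactions have accumulated since it.
   */
  static boolean shouldReject(boolean checkRecentImageEnable, NameNodeFile file,
      long timeDelta, long checkpointPeriod,
      long txid, long lastCheckpointTxid, long checkpointTxnCount) {
    return checkRecentImageEnable
        && file == NameNodeFile.IMAGE // rollback images bypass the check
        && timeDelta < checkpointPeriod
        && txid - lastCheckpointTxid < checkpointTxnCount;
  }

  public static void main(String[] args) {
    // A regular image uploaded shortly after the last checkpoint is rejected.
    System.out.println(shouldReject(true, NameNodeFile.IMAGE,
        60, 3600, 1050, 1000, 1_000_000));
    // A rollback image with the same numbers is accepted.
    System.out.println(shouldReject(true, NameNodeFile.IMAGE_ROLLBACK,
        60, 3600, 1050, 1000, 1_000_000));
  }
}
```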
            
            csun Chao Sun added a comment -

            vagarychen sorry for grabbing this JIRA too soon. Since you have done much study on this, do you want to take this JIRA instead?

            vagarychen Chen Liang added a comment -

            csun np, sure, thanks for asking. Assigning to myself then.


            shv Konstantin Shvachko added a comment -

            Good investigation and findings, vagarychen.

            1. Could you add a comment explaining that ImageServlet should not reject images other than checkpoints?
            2. I am still concerned about the "silent" part. Should we add some logging, so that next time we can see what happened on both nodes?
            hadoopqa Hadoop QA added a comment -
            -1 overall



            Vote Subsystem Runtime Comment
            0 reexec 0m 47s Docker mode activated.
                  Prechecks
            +1 @author 0m 0s The patch does not contain any @author tags.
            +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
                  trunk Compile Tests
            +1 mvninstall 20m 23s trunk passed
            +1 compile 1m 0s trunk passed
            +1 checkstyle 0m 44s trunk passed
            +1 mvnsite 1m 11s trunk passed
            +1 shadedclient 14m 43s branch has no errors when building and testing our client artifacts.
            -1 findbugs 2m 17s hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings.
            +1 javadoc 1m 20s trunk passed
                  Patch Compile Tests
            +1 mvninstall 1m 10s the patch passed
            +1 compile 0m 57s the patch passed
            +1 javac 0m 57s the patch passed
            +1 checkstyle 0m 39s the patch passed
            +1 mvnsite 1m 0s the patch passed
            +1 whitespace 0m 1s The patch has no whitespace issues.
            +1 shadedclient 13m 31s patch has no errors when building and testing our client artifacts.
            +1 findbugs 2m 20s the patch passed
            +1 javadoc 1m 10s the patch passed
                  Other Tests
            -1 unit 99m 20s hadoop-hdfs in the patch failed.
            +1 asflicense 0m 32s The patch does not generate ASF License warnings.
            162m 55s



            Reason Tests
            Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics
              hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy
              hadoop.hdfs.server.namenode.TestNamenodeCapacityReport
              hadoop.hdfs.TestReconstructStripedFile
              hadoop.hdfs.server.namenode.TestFsck



            Subsystem Report/Notes
            Docker Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169
            JIRA Issue HDFS-15036
            JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12988378/HDFS-15036.001.patch
            Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
            uname Linux 0e77d17e1e66 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
            Build tool maven
            Personality /testptch/patchprocess/precommit/personality/provided.sh
            git revision trunk / dc66de7
            maven version: Apache Maven 3.3.9
            Default Java 1.8.0_222
            findbugs v3.1.0-RC1
            findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/28488/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
            unit https://builds.apache.org/job/PreCommit-HDFS-Build/28488/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
            Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/28488/testReport/
            Max. process+thread count 2787 (vs. ulimit of 5500)
            modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
            Console output https://builds.apache.org/job/PreCommit-HDFS-Build/28488/console
            Powered by Apache Yetus 0.8.0 http://yetus.apache.org

            This message was automatically generated.

            vagarychen Chen Liang added a comment -

            Thanks for taking a look, shv! Posted v002 patch. The failed tests all passed in my local run.

            hadoopqa Hadoop QA added a comment -
            -1 overall



            Vote Subsystem Runtime Comment
            0 reexec 0m 43s Docker mode activated.
                  Prechecks
            +1 @author 0m 0s The patch does not contain any @author tags.
            +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
                  trunk Compile Tests
            +1 mvninstall 21m 7s trunk passed
            +1 compile 1m 9s trunk passed
            +1 checkstyle 0m 46s trunk passed
            +1 mvnsite 1m 15s trunk passed
            +1 shadedclient 15m 17s branch has no errors when building and testing our client artifacts.
            -1 findbugs 2m 34s hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings.
            +1 javadoc 1m 23s trunk passed
                  Patch Compile Tests
            +1 mvninstall 1m 11s the patch passed
            +1 compile 0m 58s the patch passed
            +1 javac 0m 58s the patch passed
            -0 checkstyle 0m 40s hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 88 unchanged - 0 fixed = 89 total (was 88)
            +1 mvnsite 1m 8s the patch passed
            +1 whitespace 0m 0s The patch has no whitespace issues.
            +1 shadedclient 13m 50s patch has no errors when building and testing our client artifacts.
            +1 findbugs 2m 29s the patch passed
            +1 javadoc 1m 14s the patch passed
                  Other Tests
            -1 unit 103m 17s hadoop-hdfs in the patch failed.
            +1 asflicense 0m 38s The patch does not generate ASF License warnings.
            169m 24s



            Reason Tests
            Failed junit tests hadoop.hdfs.qjournal.client.TestQuorumJournalManager
              hadoop.hdfs.server.datanode.TestBPOfferService
              hadoop.hdfs.TestFileAppend2
              hadoop.hdfs.server.namenode.TestFsck
              hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
              hadoop.hdfs.qjournal.client.TestQJMWithFaults
              hadoop.hdfs.TestWriteRead



            Subsystem Report/Notes
            Docker Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169
            JIRA Issue HDFS-15036
            JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12988468/HDFS-15036.002.patch
            Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
            uname Linux 21686e70fb56 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
            Build tool maven
            Personality /testptch/patchprocess/precommit/personality/provided.sh
            git revision trunk / 875a3e9
            maven version: Apache Maven 3.3.9
            Default Java 1.8.0_222
            findbugs v3.1.0-RC1
            findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/28497/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
            checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/28497/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
            unit https://builds.apache.org/job/PreCommit-HDFS-Build/28497/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
            Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/28497/testReport/
            Max. process+thread count 2270 (vs. ulimit of 5500)
            modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
            Console output https://builds.apache.org/job/PreCommit-HDFS-Build/28497/console
            Powered by Apache Yetus 0.8.0 http://yetus.apache.org

            This message was automatically generated.


            shv Konstantin Shvachko added a comment -

            Looks good. Minor things:

            1. Typo in doCheckpoint(). Removed is in:
              // by the other node. This could happen if
            2. Should use parameterized logging:
              LOG.info("Image upload rejected by the other NameNode: {}", uploadResult);
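For readers following the sender-side half of the discussion, the sketch below shows one way a checkpointer could turn the receiver's HTTP status into a result and log a non-successful upload with a parameterized message. It is a hedged illustration, not the committed patch: `UploadResultSketch`, its `UploadResult` enum, and both methods are hypothetical names, and it uses `java.util.logging` (with its `{0}` placeholder) only to stay standard-library-only, whereas the review comment above uses the `{}` placeholder style.

```java
import java.net.HttpURLConnection;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sender-side sketch; names are illustrative only. */
final class UploadResultSketch {
  private static final Logger LOG = Logger.getLogger("UploadResultSketch");

  /** Hypothetical outcome of an image upload as seen by the sender. */
  enum UploadResult { SUCCESS, REJECTED, FAILURE }

  /** Maps the receiver's HTTP status code to an upload result. */
  static UploadResult fromHttpStatus(int status) {
    if (status == HttpURLConnection.HTTP_OK) {
      return UploadResult.SUCCESS;
    }
    if (status == HttpURLConnection.HTTP_CONFLICT) {
      // 409 is the status sent by the recent-image check quoted earlier.
      return UploadResult.REJECTED;
    }
    return UploadResult.FAILURE;
  }

  /** Logs any non-successful result; returns true only if the image was accepted. */
  static boolean handle(UploadResult uploadResult) {
    if (uploadResult != UploadResult.SUCCESS) {
      // Parameterized logging: the message is formatted only if it is emitted.
      LOG.log(Level.INFO,
          "Image upload to the other NameNode did not succeed: {0}", uploadResult);
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(handle(fromHttpStatus(HttpURLConnection.HTTP_CONFLICT)));
    System.out.println(handle(fromHttpStatus(HttpURLConnection.HTTP_OK)));
  }
}
```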
            vagarychen Chen Liang added a comment -

            Thanks for the review shv, uploaded v03 patch

            hadoopqa Hadoop QA added a comment -
            -1 overall



            Vote Subsystem Runtime Comment
            0 reexec 0m 49s Docker mode activated.
                  Prechecks
            +1 @author 0m 0s The patch does not contain any @author tags.
            +1 test4tests 0m 1s The patch appears to include 1 new or modified test files.
                  trunk Compile Tests
            +1 mvninstall 23m 35s trunk passed
            +1 compile 1m 13s trunk passed
            +1 checkstyle 1m 2s trunk passed
            +1 mvnsite 1m 31s trunk passed
            +1 shadedclient 17m 50s branch has no errors when building and testing our client artifacts.
            -1 findbugs 2m 49s hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings.
            +1 javadoc 1m 30s trunk passed
                  Patch Compile Tests
            +1 mvninstall 1m 15s the patch passed
            +1 compile 1m 10s the patch passed
            +1 javac 1m 10s the patch passed
            -0 checkstyle 0m 41s hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 88 unchanged - 0 fixed = 89 total (was 88)
            +1 mvnsite 1m 20s the patch passed
            +1 whitespace 0m 0s The patch has no whitespace issues.
            +1 shadedclient 15m 6s patch has no errors when building and testing our client artifacts.
            +1 findbugs 2m 19s the patch passed
            +1 javadoc 1m 9s the patch passed
                  Other Tests
            -1 unit 107m 35s hadoop-hdfs in the patch failed.
            +1 asflicense 0m 36s The patch does not generate ASF License warnings.
            180m 45s



            Reason Tests
            Failed junit tests hadoop.hdfs.server.namenode.TestFsck



            Subsystem Report/Notes
            Docker Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169
            JIRA Issue HDFS-15036
            JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12988486/HDFS-15036.003.patch
            Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
            uname Linux 32b29ff6bfad 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
            Build tool maven
            Personality /testptch/patchprocess/precommit/personality/provided.sh
            git revision trunk / c2e9783
            maven version: Apache Maven 3.3.9
            Default Java 1.8.0_222
            findbugs v3.1.0-RC1
            findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/28499/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
            checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/28499/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
            unit https://builds.apache.org/job/PreCommit-HDFS-Build/28499/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
            Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/28499/testReport/
            Max. process+thread count 3176 (vs. ulimit of 5500)
            modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
            Console output https://builds.apache.org/job/PreCommit-HDFS-Build/28499/console
            Powered by Apache Yetus 0.8.0 http://yetus.apache.org

            This message was automatically generated.


            shv Konstantin Shvachko added a comment -

            +1 on v03 patch.
            TestFsck failure is tracked under HDFS-15038.
            And the checkstyle warning is bogus.
            hudson Hudson added a comment -

            SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17758 (See https://builds.apache.org/job/Hadoop-trunk-Commit/17758/)
            HDFS-15036. Active NameNode should not silently fail the image transfer. (cliang: rev 65c4660bcd897e139fc175ca438cff75ec0c6be8)

            • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java
            • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
            • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRollingUpgrade.java
            vagarychen Chen Liang added a comment - - edited

            Thanks shv! I've committed to trunk and branch-2, will commit to branch-3.2 and branch-3.1 shortly as well.


            shv Konstantin Shvachko added a comment -

            vagarychen we should commit to branch-2.10. branch-2 was deleted as per discussion on hdfs-dev.
            vagarychen Chen Liang added a comment -

            Oops! Did not realize it's already deleted, guess I missed the messages... will work on deleting it again...

            jbrennan Jim Brennan added a comment -

            shv, jhung was branch-2 actually deleted? I can still see it, and this commit is still there.
            vagarychen Chen Liang added a comment -

            Jim_Brennan I filed https://issues.apache.org/jira/browse/INFRA-19581, but haven't got update from Infra folks yet.

            jhung Jonathan Hung added a comment -

            Pushed to branch-2.10.


            People

              Assignee: vagarychen Chen Liang
              Reporter: shv Konstantin Shvachko
              Votes: 0
              Watchers: 10
