Hadoop HDFS / HDFS-11960

Successfully closed files can stay under-replicated.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha4, 2.8.2
    • Component/s: None
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      If a certain set of conditions holds at the time of file creation, a block of the file can stay under-replicated. This happens because the block is mistakenly removed from the under-replicated block queue and never re-evaluated.

      Re-evaluation can be triggered if:

      • a replica-containing node dies,
      • setrep is called, or
      • the NN replication queues are reinitialized (NN failover or restart).

      If none of these happens, the block stays under-replicated.
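
      (As an illustrative manual workaround, not part of this report: the setrep trigger can be forced by bumping the replication factor, e.g. hdfs dfs -setrep 4 /path/to/file, and then restoring it with hdfs dfs -setrep 3 /path/to/file. The path and factors here are hypothetical.)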

      Here is how it happens (a toy simulation of these steps follows the list).
      1) A replica is finalized, but the ACK does not reach the upstream node in time. The incremental block report (IBR) is also delayed.
      2) A close recovery happens, which updates the gen stamp of the "healthy" replicas.
      3) The file is closed with the healthy replicas. The block is added to the replication queue.
      4) A replication is scheduled, so the block is added to the pending replication list. The chosen replication target happens to be the failed node from step 1).
      5) The old IBR is finally received from the failed/excluded node. In the meantime, the replication fails, because there is already a finalized replica (with an older gen stamp) on that node.
      6) The IBR processing removes the block from the pending list, adds the replica to the corrupt replicas list, and then issues an invalidation. Since the block is in neither the replication queue nor the pending list, it stays under-replicated.
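
      The following self-contained toy simulation (hypothetical Java, not Hadoop code; the class name, block id, and genstamp values are invented for illustration) models steps 1) through 6) and shows how the block ends up in neither queue:

        import java.util.HashSet;
        import java.util.Set;

        public class StuckReplicationDemo {
          static final long BLOCK = 42L;
          static long storedGenStamp = 1000L;
          static final Set<Long> neededReplications = new HashSet<>();
          static final Set<Long> pendingReplications = new HashSet<>();

          // Models the pre-fix addBlock(): the pending count is decremented
          // unconditionally, even when the IBR carries a stale genstamp.
          static void processStaleIBR(long block, long reportedGenStamp) {
            pendingReplications.remove(block);
            if (reportedGenStamp < storedGenStamp) {
              // markBlockAsCorrupt() "case 3": the stale replica is invalidated
              // immediately, and nothing re-queues the block for replication.
            }
          }

          public static void main(String[] args) {
            long dn1ReplicaGS = 1000L;       // 1) DN1 finalizes at GS 1000; its IBR is delayed
            storedGenStamp = 1001L;          // 2) close recovery bumps the genstamp
            neededReplications.add(BLOCK);   // 3) file closed; block queued for replication
            neededReplications.remove(BLOCK);
            pendingReplications.add(BLOCK);  // 4) replication scheduled (target = DN1)
            // 5) replication to DN1 fails; 6) DN1's stale IBR finally arrives:
            processStaleIBR(BLOCK, dn1ReplicaGS);
            System.out.println("needed=" + neededReplications
                + " pending=" + pendingReplications); // both empty: block is stuck
          }
        }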

      Attachments

      1. HDFS-11960.patch (1.0 kB, Kihwal Lee)
      2. HDFS-11960-v2.branch-2.txt (4 kB, Kihwal Lee)
      3. HDFS-11960-v2.trunk.txt (4 kB, Kihwal Lee)


          Activity

          Kihwal Lee added a comment -

          Details of step 6).
          processIncrementalBlockReport() calls addBlock() for the received IBR with the old gen stamp. addBlock() unconditionally decrements the pending count for the block.

            void addBlock(DatanodeStorageInfo storageInfo, Block block, String delHint)
                throws IOException {
          ...
              //
              // Modify the blocks->datanode map and node's map.
              //
              pendingReplications.decrement(block, node);
              processAndHandleReportedBlock(storageInfo, block, ReplicaState.FINALIZED,
                  delHintNode);
            }
          

          In processAndHandleReportedBlock(), the replica is identified as corrupt, so markBlockAsCorrupt() is called.

            private void markBlockAsCorrupt(BlockToMarkCorrupt b,
                DatanodeStorageInfo storageInfo,
                DatanodeDescriptor node) throws IOException {
          ...
              boolean corruptedDuringWrite = minReplicationSatisfied &&
                  (b.stored.getGenerationStamp() > b.corrupted.getGenerationStamp());
              // case 1: have enough number of live replicas
              // case 2: corrupted replicas + live replicas > Replication factor
              // case 3: Block is marked corrupt due to failure while writing. In this
              //         case genstamp will be different than that of valid block.
              // In all these cases we can delete the replica.
              // In case of 3, rbw block will be deleted and valid block can be replicated
              if (hasEnoughLiveReplicas || hasMoreCorruptReplicas
                  || corruptedDuringWrite) {
                // the block is over-replicated so invalidate the replicas immediately
                invalidateBlock(b, node);
              } else if (namesystem.isPopulatingReplQueues()) {
                // add the block to neededReplication
                updateNeededReplications(b.stored, -1, 0);
              }
            }
          

          As shown above, this is treated as "case 3", which causes immediate invalidation of the corrupt replica. No further replication check is done.
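
          To make the "case 3" condition concrete, here is a small illustration with hypothetical generation stamps (the values are invented for this scenario, and minReplicationSatisfied is assumed true, as it is here):

            long storedGS    = 1001L; // bumped by the close recovery in step 2)
            long corruptedGS = 1000L; // the stale finalized replica from step 1)
            boolean minReplicationSatisfied = true; // assumed for this scenario
            boolean corruptedDuringWrite =
                minReplicationSatisfied && (storedGS > corruptedGS); // true -> invalidateBlock()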

          Kihwal Lee added a comment -

          The simplest fix is to not let addBlock() remove a pending replication if the reported genstamp is not current.

          -    if (storedBlock != null) {
          +    if (storedBlock != null &&
          +          block.getGenerationStamp() == storedBlock.getGenerationStamp()) {
                 pendingReconstruction.decrement(storedBlock, node);
               }
          

          This way, the corrupt replica will still be deleted, and if the replication is attempted and fails before the deletion, the pending replication will expire and be rescheduled. Even if it is scheduled to the same target again, it will work.
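
          A self-contained toy sketch of the patched behavior (hypothetical Java, not the Hadoop source; the class name and values are invented), showing that a stale IBR no longer cancels the pending entry:

            import java.util.HashMap;
            import java.util.Map;

            public class GuardedDecrementDemo {
              static final Map<Long, Integer> pending = new HashMap<>();
              static final long storedGenStamp = 1001L;

              // Patched bookkeeping: only a report carrying the current
              // genstamp decrements the pending-replication count.
              static void onIBR(long blockId, long reportedGenStamp) {
                if (reportedGenStamp == storedGenStamp) {
                  pending.merge(blockId, -1, Integer::sum);
                }
              }

              public static void main(String[] args) {
                pending.put(42L, 1);          // one replication in flight
                onIBR(42L, 1000L);            // stale IBR from the failed node: ignored
                System.out.println(pending);  // {42=1}: the entry survives, so it can
                                              // expire and be rescheduled normally
              }
            }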

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 19s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 14m 56s trunk passed
          +1 compile 0m 54s trunk passed
          +1 checkstyle 0m 39s trunk passed
          +1 mvnsite 1m 6s trunk passed
          +1 findbugs 1m 56s trunk passed
          +1 javadoc 0m 46s trunk passed
          +1 mvninstall 0m 51s the patch passed
          +1 compile 0m 51s the patch passed
          +1 javac 0m 51s the patch passed
          +1 checkstyle 0m 35s the patch passed
          +1 mvnsite 0m 56s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 50s the patch passed
          +1 javadoc 0m 38s the patch passed
          -1 unit 69m 6s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 19s The patch does not generate ASF License warnings.
          97m 6s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy
            hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
          Timed out junit tests org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:14b5c93
          JIRA Issue HDFS-11960
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12872320/HDFS-11960.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux dbe1b3b56c89 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 99634d1
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19854/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19854/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19854/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Kihwal Lee added a comment -

          All failed tests passed when rerun.

          -------------------------------------------------------
           T E S T S
          -------------------------------------------------------
          Running org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
          Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.489 sec
           - in org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
          Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
          Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 81.012 sec
           - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
          Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy
          Tests run: 14, Failures: 0, Errors: 0, Skipped: 10, Time elapsed: 78.64 sec
           - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy
          
          Daryn Sharp added a comment -

          Looks like a good fix but there really should be a test to prevent a regression. Chasing down all the not-replicating issues has been a pain over the years...

          Kihwal Lee added a comment -

          Added unit test.

          Kihwal Lee added a comment -

          The branch-2 patch is identical except for the name change from "Reconstruction" to "Replication".

          Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 16s Docker mode activated.
          0 patch 0m 1s The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 15m 1s trunk passed
          +1 compile 1m 5s trunk passed
          +1 checkstyle 0m 46s trunk passed
          +1 mvnsite 1m 7s trunk passed
          +1 findbugs 2m 10s trunk passed
          +1 javadoc 0m 45s trunk passed
          +1 mvninstall 0m 53s the patch passed
          +1 compile 0m 50s the patch passed
          +1 javac 0m 50s the patch passed
          -0 checkstyle 0m 35s hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 127 unchanged - 0 fixed = 129 total (was 127)
          +1 mvnsite 0m 54s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 10s the patch passed
          +1 javadoc 0m 40s the patch passed
          +1 unit 97m 5s hadoop-hdfs in the patch passed.
          +1 asflicense 0m 21s The patch does not generate ASF License warnings.
          126m 0s



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:14b5c93
          JIRA Issue HDFS-11960
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12873547/HDFS-11960-v2.trunk.txt
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux bb2f07bdf0b9 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 73fb750
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19954/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19954/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19954/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 19m 22s Docker mode activated.
          0 patch 0m 1s The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 22s branch-2 passed
          +1 compile 0m 34s branch-2 passed with JDK v1.8.0_131
          +1 compile 0m 38s branch-2 passed with JDK v1.7.0_131
          +1 checkstyle 0m 25s branch-2 passed
          +1 mvnsite 0m 48s branch-2 passed
          +1 findbugs 1m 51s branch-2 passed
          +1 javadoc 0m 34s branch-2 passed with JDK v1.8.0_131
          +1 javadoc 0m 54s branch-2 passed with JDK v1.7.0_131
          +1 mvninstall 0m 41s the patch passed
          +1 compile 0m 36s the patch passed with JDK v1.8.0_131
          +1 javac 0m 36s the patch passed
          +1 compile 0m 39s the patch passed with JDK v1.7.0_131
          +1 javac 0m 39s the patch passed
          -0 checkstyle 0m 22s hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 146 unchanged - 0 fixed = 148 total (was 146)
          +1 mvnsite 0m 48s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 57s the patch passed
          +1 javadoc 0m 34s the patch passed with JDK v1.8.0_131
          +1 javadoc 0m 52s the patch passed with JDK v1.7.0_131
          -1 unit 49m 45s hadoop-hdfs in the patch failed with JDK v1.7.0_131.
          -1 asflicense 0m 18s The patch generated 1 ASF License warnings.
          149m 31s



          Reason Tests
          JDK v1.8.0_131 Failed junit tests hadoop.hdfs.web.TestWebHdfsTimeouts
          JDK v1.8.0_131 Timed out junit tests org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
          JDK v1.7.0_131 Failed junit tests hadoop.hdfs.web.TestWebHdfsTimeouts
            hadoop.hdfs.server.namenode.TestDecommissioningStatus
            hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain
            hadoop.hdfs.server.namenode.ha.TestHASafeMode



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:5e40efe
          JIRA Issue HDFS-11960
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12873559/HDFS-11960-v2.branch-2.txt
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 32a648688be4 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision branch-2 / 71626fd
          Default Java 1.7.0_131
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_131 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_131
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19960/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19960/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_131.txt
          JDK v1.7.0_131 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19960/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/19960/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19960/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Daryn Sharp added a comment -

          +1 Looks good. It's great to have finally solved this vexing problem with persistent under-replication!

          Kihwal Lee added a comment -

          Thanks for the review, Daryn. I've committed this to trunk and branch-2. On branch-2.8, the test fails due to an extra check in BlockManager. I will update the patch.

          Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11894 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11894/)
          HDFS-11960. Successfully closed files can stay under-replicated. (kihwal: rev 8c0769dee4b455f4de08ccce36334f0be9e79e2c)

          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestPendingReconstruction.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          Kihwal Lee added a comment -

          HDFS-9754 was needed for the test. Cherry-picked HDFS-9754 and this JIRA to branch-2.8. The test passes now.


            People

            • Assignee: Kihwal Lee
            • Reporter: Kihwal Lee
            • Votes: 0
            • Watchers: 14
