
ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()

    Details

    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      When a large number of nodes are removed by refreshing the node lists, the network topology is updated. If the refresh happens at the right moment, the replication monitor thread may get stuck in the while loop of chooseRandom(). This is because the cached cluster size is used in the termination condition of the loop. It usually happens when a block with a high replication factor is being processed. Since the replicas-per-rack limit is also calculated beforehand, no node choice may satisfy the goodness criteria if the refresh removed racks.

      All nodes will end up in the excluded list, but the size will still be less than the cached cluster size, so it will loop infinitely. This was observed in a production environment.
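The failure mode described above can be illustrated with a small, self-contained sketch. This is not the Hadoop code: the class `StaleCountLoop`, the method `run`, the node names, and the `maxIterations` safety cap are all hypothetical and exist only so the example can finish. It assumes that no picked node ever passes the goodness check, so the loop can end only when the cached count reaches zero.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

final class StaleCountLoop {
    /**
     * Mimics a selection loop whose exit test relies on a node count that was
     * cached before the loop started. Returns the number of iterations used;
     * a return value equal to maxIterations means the safety cap cut it off.
     */
    static int run(List<String> liveNodes, int cachedAvailable, int maxIterations) {
        if (liveNodes.isEmpty()) {
            return 0; // nothing to pick from
        }
        Set<String> excluded = new HashSet<>();
        Random random = new Random(42);
        int iterations = 0;
        // Assumption: no node is ever "good", so nothing is chosen and the
        // loop ends only when the cached count drops to zero.
        while (cachedAvailable > 0 && iterations < maxIterations) {
            iterations++;
            String picked = liveNodes.get(random.nextInt(liveNodes.size()));
            if (excluded.add(picked)) {
                cachedAvailable--; // only a newly excluded node counts down
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        List<String> live = Arrays.asList("dn1", "dn2", "dn3");
        // The count was taken when 10 nodes existed; only 3 remain, so the
        // count can never fall below 7 and the cap is hit.
        System.out.println(run(live, 10, 100_000)); // prints 100000
    }
}
```

Without the cap, the first call would never return: once all three remaining nodes are in the excluded set, the counter stays at 7.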

      1. HDFS-4937.patch
        1 kB
        Kihwal Lee
      2. HDFS-4937.v1.patch
        2 kB
        Kihwal Lee
      3. HDFS-4937.v1.patch
        2 kB
        Kihwal Lee
      4. HDFS-4937.v3.patch
        2 kB
        Kihwal Lee

        Issue Links

          Activity

          patibandlas2 Siva Teja Patibandla added a comment -

          Hi Kihwal, was the v3 patch tested? It seems the whole chooseRandom() function was rewritten in later releases, so the fix may not have gotten much test mileage, and I am not sure whether I should use it.

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Closing the JIRA as part of 2.7.3 release.

          hudson Hudson added a comment -

          ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #574 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/574/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev ff47f35deed14ba6463cba76f0e6a6c15abb3eca)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #631 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/631/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev ff47f35deed14ba6463cba76f0e6a6c15abb3eca)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2571 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2571/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev ff47f35deed14ba6463cba76f0e6a6c15abb3eca)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #1365 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1365/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev ff47f35deed14ba6463cba76f0e6a6c15abb3eca)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #641 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/641/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev ff47f35deed14ba6463cba76f0e6a6c15abb3eca)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2512 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2512/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev ff47f35deed14ba6463cba76f0e6a6c15abb3eca)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8760 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8760/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev ff47f35deed14ba6463cba76f0e6a6c15abb3eca)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          kihwal Kihwal Lee added a comment -

          Thanks for the reviews and reporting mistakes, gentlemen. I've committed this to trunk, branch-2 and branch-2.7. Long live Hadoop.

          brahmareddy Brahma Reddy Battula added a comment -

          Yes, the v1 patch LGTM, +1 (non-binding).

          hitliuyi Yi Liu added a comment -

          This time it's correct. The logic of the current patch is straightforward.

          +1 for the v1 patch, thanks Kihwal.

          kihwal Kihwal Lee added a comment -

          I will file a jira for this.

          HDFS-9376

          kihwal Kihwal Lee added a comment -

          First of all, the precommit build ran 4,075 test cases, so I think it ran all of them this time.

          The test failures are not related to the patch. I've rerun the failed tests, and only TestSeveralNameNodes was failing occasionally. It was timing out waiting for a thread to finish writing. This test has been failing in other precommit builds as well. When I increased the timeout, it passed 100% of the time. I will file a jira for this.

          -------------------------------------------------------
          T E S T S
          -------------------------------------------------------
          Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
          Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
          Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 62.298 sec - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
          Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
          Running org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer
          Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.295 sec - in org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer
          Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
          Running org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes
          Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 157.484 sec - in org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes
          Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
          Running org.apache.hadoop.hdfs.TestLeaseRecovery2
          Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 73.445 sec - in org.apache.hadoop.hdfs.TestLeaseRecovery2
          Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
          Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160
          Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 98.315 sec - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160
          Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
          Running org.apache.hadoop.hdfs.TestCrcCorruption
          Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 30.387 sec - in org.apache.hadoop.hdfs.TestCrcCorruption
          Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
          Running org.apache.hadoop.hdfs.security.TestDelegationTokenForProxyUser
          Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.775 sec - in org.apache.hadoop.hdfs.security.TestDelegationTokenForProxyUser

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 12s docker + precommit patch detected.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 3m 32s trunk passed
          +1 compile 0m 46s trunk passed with JDK v1.8.0_60
          +1 compile 0m 40s trunk passed with JDK v1.7.0_79
          +1 checkstyle 0m 20s trunk passed
          +1 mvneclipse 0m 18s trunk passed
          -1 findbugs 2m 22s hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs
          +1 javadoc 1m 32s trunk passed with JDK v1.8.0_60
          +1 javadoc 2m 31s trunk passed with JDK v1.7.0_79
          +1 mvninstall 0m 51s the patch passed
          +1 compile 0m 46s the patch passed with JDK v1.8.0_60
          +1 javac 0m 46s the patch passed
          +1 compile 0m 44s the patch passed with JDK v1.7.0_79
          +1 javac 0m 44s the patch passed
          +1 checkstyle 0m 19s the patch passed
          +1 mvneclipse 0m 18s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 2m 30s the patch passed
          +1 javadoc 1m 31s the patch passed with JDK v1.8.0_60
          +1 javadoc 2m 30s the patch passed with JDK v1.7.0_79
          -1 unit 85m 6s hadoop-hdfs in the patch failed with JDK v1.8.0_60.
          -1 unit 78m 34s hadoop-hdfs in the patch failed with JDK v1.7.0_79.
          -1 asflicense 0m 25s Patch generated 56 ASF License warnings.
          189m 15s



          Reason Tests
          JDK v1.8.0_60 Failed junit tests hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160
            hadoop.hdfs.TestCrcCorruption
            hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
            hadoop.hdfs.TestLeaseRecovery2
            hadoop.hdfs.security.TestDelegationTokenForProxyUser
            hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes
            hadoop.hdfs.server.namenode.ha.TestEditLogTailer
          JDK v1.7.0_79 Failed junit tests hadoop.hdfs.server.namenode.ha.TestDNFencing
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160
            hadoop.hdfs.security.TestDelegationTokenForProxyUser
            hadoop.hdfs.server.datanode.TestDirectoryScanner
            hadoop.hdfs.TestEncryptionZones



          Subsystem Report/Notes
          Docker Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-04
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12770435/HDFS-4937.v1.patch
          JIRA Issue HDFS-4937
          Optional Tests asflicense javac javadoc mvninstall unit findbugs checkstyle compile
          uname Linux 5646a82bf393 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/patchprocess/apache-yetus-1a9afee/precommit/personality/hadoop.sh
          git revision trunk / dac0463
          Default Java 1.7.0_79
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_60 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_79
          findbugs v3.0.0
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/13367/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs.html
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/13367/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_60.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/13367/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_79.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/13367/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_60.txt https://builds.apache.org/job/PreCommit-HDFS-Build/13367/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_79.txt
          JDK v1.7.0_79 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/13367/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/13367/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Max memory used 228MB
          Powered by Apache Yetus http://yetus.apache.org
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/13367/console

          This message was automatically generated.

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Moving out all non-critical / non-blocker issues that didn't make it out of 2.7.2 into 2.7.3.

          kihwal Kihwal Lee added a comment - - edited

          Actually, the patch I wrote 2 years ago and the rebased one are correct. My subsequent attempt to "improve" it was based on my recent incorrect understanding of the patch. Now I remember why I did it that way.

          After a sufficient number of random, potentially duplicate picks have been tried, the total candidate node count is refreshed. The refreshed number does not include nodes that were already tried and excluded, so it is truly the remaining candidate node count, based on the list of nodes already tried and the latest network topology. The loop continues until all candidate nodes are exhausted or enough replicas have been picked.

          Resubmitting v1, the rebased original patch.
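The refresh idea described in this comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the committed patch: `RefreshingCountLoop`, `countAvailable`, `run`, and the `maxIterations` cap are hypothetical names, and every picked node is assumed to fail the goodness check so that only the candidate count can end the loop.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

final class RefreshingCountLoop {
    /** Hypothetical stand-in: counts live nodes that are not yet excluded. */
    static int countAvailable(List<String> liveNodes, Set<String> excluded) {
        int count = 0;
        for (String node : liveNodes) {
            if (!excluded.contains(node)) {
                count++;
            }
        }
        return count;
    }

    /** Returns the number of iterations the loop needed to terminate. */
    static int run(List<String> liveNodes, int staleAvailable, int maxIterations) {
        if (liveNodes.isEmpty()) {
            return 0; // nothing to pick from
        }
        Set<String> excluded = new HashSet<>();
        Random random = new Random(42);
        int available = staleAvailable;  // may be larger than the live set
        int refreshCounter = available;  // picks left before the next recount
        int iterations = 0;
        while (available > 0 && iterations < maxIterations) {
            iterations++;
            String picked = liveNodes.get(random.nextInt(liveNodes.size()));
            if (excluded.add(picked)) {
                available--;
            }
            // After as many picks as there were candidates, recount against
            // the current topology, leaving out nodes already tried.
            if (--refreshCounter == 0) {
                available = countAvailable(liveNodes, excluded);
                refreshCounter = available;
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        List<String> live = Arrays.asList("dn1", "dn2", "dn3");
        // A stale count of 10 no longer traps the loop: the recount after 10
        // picks sees at most 3 live nodes, minus those already excluded.
        System.out.println(run(live, 10, 100_000));
    }
}
```

The recount is deferred rather than done on every pick, on the assumption that counting available nodes is more expensive than a single random pick; the price is some wasted iterations before a stale count is noticed.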

          kihwal Kihwal Lee added a comment -

          The test failures are definitely related. When I run TestReplicationPolicy, different cases fail depending on test ordering. One failure might be affecting other cases.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 6s docker + precommit patch detected.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 3m 10s trunk passed
          +1 compile 0m 38s trunk passed with JDK v1.8.0_60
          +1 compile 0m 31s trunk passed with JDK v1.7.0_79
          +1 checkstyle 0m 16s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          -1 findbugs 1m 49s hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs
          +1 javadoc 1m 8s trunk passed with JDK v1.8.0_60
          +1 javadoc 1m 45s trunk passed with JDK v1.7.0_79
          +1 mvninstall 0m 37s the patch passed
          +1 compile 0m 32s the patch passed with JDK v1.8.0_60
          +1 javac 0m 32s the patch passed
          +1 compile 0m 30s the patch passed with JDK v1.7.0_79
          +1 javac 0m 30s the patch passed
          +1 checkstyle 0m 15s the patch passed
          +1 mvneclipse 0m 13s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 2m 0s the patch passed
          +1 javadoc 1m 4s the patch passed with JDK v1.8.0_60
          +1 javadoc 1m 46s the patch passed with JDK v1.7.0_79
          -1 unit 50m 21s hadoop-hdfs in the patch failed with JDK v1.8.0_60.
          -1 unit 49m 46s hadoop-hdfs in the patch failed with JDK v1.7.0_79.
          -1 asflicense 0m 20s Patch generated 58 ASF License warnings.
          119m 34s



          Reason Tests
          JDK v1.8.0_60 Failed junit tests hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithNodeGroup
            hadoop.hdfs.server.blockmanagement.TestReplicationPolicy
            hadoop.hdfs.server.blockmanagement.TestBlockManager
            hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain
            hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes
            hadoop.hdfs.server.blockmanagement.TestReplicationPolicyConsiderLoad
          JDK v1.7.0_79 Failed junit tests hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithNodeGroup
            hadoop.hdfs.server.namenode.ha.TestEditLogTailer
            hadoop.hdfs.server.blockmanagement.TestReplicationPolicy
            hadoop.hdfs.TestDecommission
            hadoop.hdfs.server.blockmanagement.TestBlockManager



          Subsystem Report/Notes
          Docker Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-02
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12770113/HDFS-4937.v3.patch
          JIRA Issue HDFS-4937
          Optional Tests asflicense javac javadoc mvninstall unit findbugs checkstyle compile
          uname Linux 9859bf70b55d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/patchprocess/apache-yetus-e77b1ce/precommit/personality/hadoop.sh
          git revision trunk / 9e7dcab
          Default Java 1.7.0_79
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_60 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_79
          findbugs v3.0.0
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/13334/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs.html
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/13334/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_60.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/13334/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_79.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/13334/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_60.txt https://builds.apache.org/job/PreCommit-HDFS-Build/13334/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_79.txt
          JDK v1.7.0_79 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/13334/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/13334/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Max memory used 225MB
          Powered by Apache Yetus http://yetus.apache.org
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/13334/console

          This message was automatically generated.

          kihwal Kihwal Lee added a comment -

          So sorry about the spectacular 118 test failures! It should have refreshed the count with an empty exclude node set to obtain the correct count. Looks like a few failed test cases are passing with the change. Let's see if the precommit agrees.
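
One possible reading of "refreshed the count with an empty exclude node set" is sketched below. This is not the attached patch. It is a simplified model in which nodes are plain integers, every node is assumed to pass isGoodDatanode(), the random chooseDataNode() call is replaced by a fixed picks array so the run is reproducible, and countAvailable() is a hypothetical stand-in for clusterMap.countNumOfAvailableNodes(scope, excludedNodes). The class and method names are invented for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class RefreshWithEmptyExcludeSketch {
  // Hypothetical stand-in for clusterMap.countNumOfAvailableNodes(scope, excluded):
  // nodes 0..totalNodes-1 are in scope; counts those not present in 'excluded'.
  static int countAvailable(int totalNodes, Set<Integer> excluded) {
    int count = 0;
    for (int node = 0; node < totalNodes; node++) {
      if (!excluded.contains(node)) {
        count++;
      }
    }
    return count;
  }

  // Returns how many targets were chosen before the loop exited.
  static int chooseTargets(int totalNodes, int numOfReplicas, int[] picks) {
    Set<Integer> excludedNodes = new HashSet<>();
    int numOfAvailableNodes = totalNodes;
    int refreshCounter = numOfAvailableNodes;
    int chosen = 0;
    int next = 0;
    while (numOfReplicas > 0 && numOfAvailableNodes > 0 && next < picks.length) {
      int chosenNode = picks[next++];      // stands in for the random pick
      if (excludedNodes.add(chosenNode)) { // was not in the excluded list
        numOfAvailableNodes--;
        numOfReplicas--;                   // assumption: every node is "good"
        chosen++;
      }
      if (--refreshCounter == 0) {
        // Count with an EMPTY exclude set: the number of live nodes in scope.
        refreshCounter = countAvailable(totalNodes, Collections.<Integer>emptySet());
        // Give up only when every live node has already been examined.
        if (refreshCounter <= excludedNodes.size()) {
          break;
        }
      }
    }
    return chosen;
  }

  public static void main(String[] args) {
    // 3 nodes, 3 replicas, one duplicate pick: the loop keeps going and
    // still reaches node 2.
    System.out.println(chooseTargets(3, 3, new int[] {0, 0, 1, 2})); // prints 3
  }
}
```

In this model the refreshed value is the total number of live nodes, so the break condition only fires once the excluded set covers all of them.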

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2497 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2497/)
          Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #560 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/560/)
          Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2554 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2554/)
          Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #1347 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1347/)
          Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #612 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/612/)
          Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #624 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/624/)
          Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8738 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8738/)
          Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hitliuyi Yi Liu added a comment - - edited

          I did consider the situation you mentioned, but I thought that in a real environment the NN could find other racks/DNs once it had gone through enough (not all) of the nodes. I missed the fact that many tests contain only a few available DNs, so refreshCounter <= excludedNodes.size() will be true. This may also happen in a real environment if the total number of DNs is small. So the patch is not correct for these cases, hence the revert.

          hitliuyi Yi Liu added a comment - - edited

          Reverted from trunk, branch-2, and branch-2.7. Thanks Vinay and Brahma.

          I thought the tests had passed, but the Jenkins report actually did not include the test results.

          brahmareddy Brahma Reddy Battula added a comment -

          I don't know if I can give a -1. But shall we revert this? A lot of tests are broken because of it.

          662 int refreshCounter = numOfAvailableNodes;
          ...
          671 while(numOfReplicas > 0 && numOfAvailableNodes > 0) {
          672   DatanodeDescriptor chosenNode = chooseDataNode(scope);
          673   if (excludedNodes.add(chosenNode)) { //was not in the excluded list
          674     if (LOG.isDebugEnabled()) {
          675       builder.append("\nNode ").append(NodeBase.getPath(chosenNode)).append(" [");
          676     }
          677     numOfAvailableNodes--;
          678     DatanodeStorageInfo storage = null;
          679     if (isGoodDatanode(chosenNode, maxNodesPerRack, considerLoad,
          ...
          711   }
          712   // Refresh the node count. If the live node count became smaller,
          713   // but it is not reflected in this loop, it may loop forever in case
          714   // the replicas/rack cannot be satisfied.
          715   if (--refreshCounter == 0) {
          716     refreshCounter = clusterMap.countNumOfAvailableNodes(scope,
          717     excludedNodes);
          718     // It has already gone through enough number of nodes.
          719     if (refreshCounter <= excludedNodes.size()) {
          720       break;
          721     }
          722   }
          723 }
          

          chooseDataNode(scope) on line 672 is random. If chosenNode happens to be an already excluded node, execution does not reach line 674, but refreshCounter is still decremented.
          If we are out of luck and chooseDataNode(scope) returns an already excluded node too many times, we enter line 716 and break at line 720.
          We then end up choosing fewer than numOfReplicas nodes, even though enough nodes were in fact available.
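
The early break described above can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions, not the HDFS code: nodes are plain integers, every node is assumed to pass isGoodDatanode(), the random chooseDataNode(scope) call is replaced by a fixed picks array so the run is deterministic, and clusterMap.countNumOfAvailableNodes(scope, excludedNodes) is modeled as totalNodes minus the excluded count. The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class RefreshCounterSketch {
  // Returns how many targets were chosen before the loop exited.
  static int chooseTargets(int totalNodes, int numOfReplicas, int[] picks) {
    Set<Integer> excludedNodes = new HashSet<>();
    int numOfAvailableNodes = totalNodes;
    int refreshCounter = numOfAvailableNodes;
    int chosen = 0;
    int next = 0;
    while (numOfReplicas > 0 && numOfAvailableNodes > 0 && next < picks.length) {
      int chosenNode = picks[next++];      // stands in for the random pick
      if (excludedNodes.add(chosenNode)) { // was not in the excluded list
        numOfAvailableNodes--;
        numOfReplicas--;                   // assumption: every node is "good"
        chosen++;
      }
      // Decremented on every iteration, including duplicate picks.
      if (--refreshCounter == 0) {
        // Models countNumOfAvailableNodes(scope, excludedNodes).
        refreshCounter = totalNodes - excludedNodes.size();
        if (refreshCounter <= excludedNodes.size()) {
          break;                           // may fire while a node is still unused
        }
      }
    }
    return chosen;
  }

  public static void main(String[] args) {
    // 3 nodes, 3 replicas. The duplicate pick of node 0 uses up the counter,
    // so the loop breaks after two targets and never looks at node 2.
    System.out.println(chooseTargets(3, 3, new int[] {0, 0, 1, 2})); // prints 2
    // Without a duplicate pick, all three targets are chosen.
    System.out.println(chooseTargets(3, 3, new int[] {0, 1, 2}));    // prints 3
  }
}
```

With only three nodes, a single duplicate pick is enough to trigger the break, which matches the small clusters used by the failing tests.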

          vinayrpet Vinayakumar B added a comment -

          Hi Allen Wittenauer, any idea why the tests did not run in the last precommit for the patch here (above comment)?

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #556 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/556/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev 43539b5ff4ac0874a8a454dc93a2a782b0e0ea8f)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #607 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/607/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev 43539b5ff4ac0874a8a454dc93a2a782b0e0ea8f)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #1342 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1342/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev 43539b5ff4ac0874a8a454dc93a2a782b0e0ea8f)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #619 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/619/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev 43539b5ff4ac0874a8a454dc93a2a782b0e0ea8f)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2493 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2493/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev 43539b5ff4ac0874a8a454dc93a2a782b0e0ea8f)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2549 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2549/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev 43539b5ff4ac0874a8a454dc93a2a782b0e0ea8f)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          kihwal Kihwal Lee added a comment -

          Also committed to branch-2.7.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8730 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8730/)
          HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev 43539b5ff4ac0874a8a454dc93a2a782b0e0ea8f)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          kihwal Kihwal Lee added a comment -

          Thanks for the review, Yi Liu. I've committed this to trunk and branch-2.

          hitliuyi Yi Liu added a comment -

          +1, thanks Kihwal.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 6s docker + precommit patch detected.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 3m 0s trunk passed
          +1 compile 4m 21s trunk passed with JDK v1.8.0_60
          +1 compile 4m 4s trunk passed with JDK v1.7.0_79
          +1 checkstyle 0m 58s trunk passed
          +1 mvneclipse 0m 0s trunk passed
          +1 findbugs 0m 0s trunk passed
          +1 javadoc 0m 0s trunk passed
          +1 javadoc 0m 0s trunk passed
          +1 mvninstall 0m 0s the patch passed
          +1 compile 4m 34s the patch passed with JDK v1.8.0_60
          +1 javac 4m 34s the patch passed
          +1 compile 4m 31s the patch passed with JDK v1.7.0_79
          +1 javac 4m 31s the patch passed
          +1 checkstyle 1m 0s the patch passed
          +1 mvneclipse 0m 0s the patch passed
          +1 shellcheck 0m 8s There were no new shellcheck issues.
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 0m 0s the patch passed
          +1 javadoc 0m 0s the patch passed
          +1 javadoc 0m 0s the patch passed
          -1 asflicense 0m 13s Patch generated 1 ASF License warnings.
          23m 45s



          Subsystem Report/Notes
          Docker Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-10-30
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12769642/HDFS-4937.v2.patch
          JIRA Issue HDFS-4937
          Optional Tests asflicense shellcheck javac javadoc mvninstall unit findbugs checkstyle compile
          uname Linux 933890a322fa 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/patchprocess/apache-yetus-e77b1ce/precommit/personality/hadoop.sh
          git revision trunk / e5b1733
          Default Java 1.7.0_79
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_60 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_79
          shellcheck v0.4.1
          JDK v1.7.0_79 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/13285/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/13285/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: . hadoop-hdfs-project/hadoop-hdfs U: .
          Max memory used 225MB
          Powered by Apache Yetus http://yetus.apache.org
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/13285/console

          This message was automatically generated.

          kihwal Kihwal Lee added a comment -

          Daryn and I were looking at the patch and realized it can be improved. Attaching a new patch.

          kihwal Kihwal Lee added a comment -

          The failed test cases pass when run locally.

          -------------------------------------------------------
           T E S T S
          -------------------------------------------------------
          Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
          Running org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
          Tests run: 36, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 146.108 sec - in org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
          Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
          Running org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap
          Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 30.379 sec - in org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap
          Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
          Running org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer
          Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.582 sec - in org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer
          Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
          Running org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints
          Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 104.369 sec - in org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints
          
          Results :
          
          Tests run: 54, Failures: 0, Errors: 0, Skipped: 0
          

          Also, there actually is no new findbugs issue.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote  Subsystem        Runtime   Comment
          -1    pre-patch        16m 36s   Findbugs (version ) appears to be broken on trunk.
          +1    @author          0m 0s     The patch does not contain any @author tags.
          -1    tests included   0m 0s     The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1    javac            8m 2s     There were no new javac warning messages.
          +1    javadoc          10m 39s   There were no new javadoc warning messages.
          +1    release audit    0m 23s    The applied patch does not increase the total number of release audit warnings.
          +1    checkstyle       0m 37s    There were no new checkstyle issues.
          +1    whitespace       0m 0s     The patch has no lines that end in whitespace.
          +1    install          1m 39s    mvn install still works.
          +1    eclipse:eclipse  0m 36s    The patch built with eclipse:eclipse.
          -1    findbugs         2m 35s    The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings.
          +1    native           3m 13s    Pre-build of native portion
          -1    hdfs tests       50m 48s   Tests failed in hadoop-hdfs.
                (total)          95m 11s



          Reason             Tests
          FindBugs           module:hadoop-hdfs
          Failed unit tests  hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap
                             hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints
                             hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
                             hadoop.hdfs.server.namenode.ha.TestEditLogTailer



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12768765/HDFS-4937.v1.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 2f1eb2b
          Findbugs warnings https://builds.apache.org/job/PreCommit-HDFS-Build/13202/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
          hadoop-hdfs test log https://builds.apache.org/job/PreCommit-HDFS-Build/13202/artifact/patchprocess/testrun_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/13202/testReport/
          Java 1.7.0_55
          uname Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/13202/console

          This message was automatically generated.

          kihwal Kihwal Lee added a comment -

          Refreshed the patch based on trunk.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 patch 0m 0s The patch command could not apply the patch during dryrun.



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12595453/HDFS-4937.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / aea26bf
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/13130/console

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 patch 0m 0s The patch command could not apply the patch during dryrun.



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12595453/HDFS-4937.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / f1a152c
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/10556/console

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 patch 0m 0s The patch command could not apply the patch during dryrun.



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12595453/HDFS-4937.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / f1a152c
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/10547/console

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12595453/HDFS-4937.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4754//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4754//console

          This message is automatically generated.

          kihwal Kihwal Lee added a comment -

          "Even then, was it not able to choose at least from them?"

          It couldn't pick enough nodes because the max replicas/rack had already been calculated. I think it worked fine for the majority of blocks with 3 replicas, since the cluster had more than 3 racks even after the refresh. The issue was with blocks that have many more replicas. But picking enough nodes is just one exit condition. The other checks whether the candidate nodes are exhausted. The while loop would have bailed out if the cached cluster size had been updated inside the loop.

          To avoid refreshing the cluster size frequently for this rare condition, we can update the cached value after dfs.replication.max iterations, within which most blocks should find all the nodes they need. If the NN hits this issue, it will loop dfs.replication.max times and break out. I prefer this over adding locking, which would slow down the normal cases.
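
For illustration only, here is a minimal, single-threaded sketch of the idea described above. It is not the attached patch and not the real BlockPlacementPolicyDefault code: the class ChooseRandomSketch, its parameters, and the isGood predicate are hypothetical stand-ins. The stale cluster size is passed in explicitly to mimic a topology that shrank after the size was cached, and refreshInterval plays the role of dfs.replication.max.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Predicate;

public class ChooseRandomSketch {

    /**
     * Hypothetical model of the selection loop.
     *
     * @param wanted           number of nodes the caller still needs
     * @param topology         nodes currently in the (already shrunk) topology
     * @param staleClusterSize cluster size cached before the refresh
     * @param isGood           stand-in for the "goodness" check (e.g. replicas/rack limit)
     * @param refreshInterval  iterations between cache refreshes; must be > 0
     */
    public static List<String> chooseRandom(int wanted, List<String> topology,
            int staleClusterSize, Predicate<String> isGood,
            int refreshInterval, Random rng) {
        List<String> chosen = new ArrayList<>();
        Set<String> excluded = new HashSet<>();
        int cachedSize = staleClusterSize;
        int iterations = 0;

        // Two exit conditions: enough nodes picked, or candidates exhausted.
        // With a stale cachedSize the second condition can never become true.
        while (chosen.size() < wanted && excluded.size() < cachedSize) {
            if (topology.isEmpty()) {
                break; // nothing left to pick from
            }
            String node = topology.get(rng.nextInt(topology.size()));
            // Every examined node is excluded from later picks, good or not.
            if (excluded.add(node) && isGood.test(node)) {
                chosen.add(node);
            }
            // Periodically re-read the real size so the exhaustion check
            // can fire even if the topology shrank underneath us.
            if (++iterations % refreshInterval == 0) {
                cachedSize = topology.size();
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<String> topology = new ArrayList<>();
        topology.add("n1");
        topology.add("n2");
        topology.add("n3");
        topology.add("n4");
        // Cached size says 10, but only 4 nodes remain and none is acceptable.
        List<String> picked = chooseRandom(5, topology, 10, n -> false, 8, new Random(1));
        System.out.println("chosen=" + picked.size());
    }
}
```

Without the periodic refresh, the call in main would spin forever: all four remaining nodes end up in the excluded set, which is still smaller than the cached size of 10. Refreshing only every refreshInterval iterations keeps the common case free of extra work.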

          umamaheswararao Uma Maheswara Rao G added a comment -

          Hi Kihwal, you said in your comment that the operator added a large number of new nodes, right? Even then, was it not able to choose at least from them?

          kihwal Kihwal Lee added a comment -

          This can mostly be avoided by decommissioning nodes in smaller batches, which is the recommended practice. In this particular case, however, the operator added a large number of new nodes and decommissioned the old nodes.


            People

            • Assignee:
              kihwal Kihwal Lee
              Reporter:
              kihwal Kihwal Lee
            • Votes:
              0
              Watchers:
              19


                Development