  1. Hadoop HDFS
  2. HDFS-9917

IBR accumulate more objects when SNN was down for sometime.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.2
    • Fix Version/s: 2.8.0, 2.7.3, 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      The SNN (standby NameNode) was down for some time. After it was restarted, it became unresponsive because:

      • 29 DNs were each sending about 5 million IBRs (most of them delete IBRs), whereas each DataNode had only ~2.5 million blocks.
      • GC cannot reclaim these objects because they are all still held in the RPC queue.

      To recover from this (to clear these objects), all the DNs were restarted one by one. This issue happened in 2.4.1, where splitting of block reports was not available.

      1. HDFS-9917.patch
        2 kB
        Brahma Reddy Battula
      2. HDFS-9917-branch-2.7.patch
        8 kB
        Brahma Reddy Battula
      3. HDFS-9917-02.patch
        8 kB
        Brahma Reddy Battula
      4. HDFS-9917-branch-2.7-002.patch
        7 kB
        Brahma Reddy Battula

        Activity

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Closing the JIRA as part of 2.7.3 release.

        szetszwo Tsz Wo Nicholas Sze added a comment -

        Brahma Reddy Battula and Vinayakumar B, thanks a lot for working on this!

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #9555 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9555/)
        HDFS-9917. IBR accumulate more objects when SNN was down for sometime. (vinayakumarb: rev 818d6b799eead13a17a0214172df60a269b046fb)

        • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
        • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/IncrementalBlockReportManager.java
        • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java
        brahmareddy Brahma Reddy Battula added a comment -

        Vinayakumar B, thanks for the review and commit, and thanks to Tsz Wo Nicholas Sze for the additional review.

        vinayrpet Vinayakumar B added a comment -

        Committed to trunk, branch-2, branch-2.8 and branch-2.7.

        Thanks Brahma Reddy Battula for the contribution.
        Thanks Tsz Wo Nicholas Sze for reviews.

        vinayrpet Vinayakumar B added a comment -

        Committing shortly.

        brahmareddy Brahma Reddy Battula added a comment -

        TestRenameWithSnapshots, TestDecommissioningStatus and TestBalancer passed locally.
        TestHFlush is tracked in HDFS-2043.

        Kindly review the patch.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 20s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
        +1 mvninstall 8m 32s branch-2.7 passed
        +1 compile 1m 15s branch-2.7 passed with JDK v1.8.0_77
        +1 compile 1m 27s branch-2.7 passed with JDK v1.7.0_95
        +1 checkstyle 0m 40s branch-2.7 passed
        +1 mvnsite 1m 29s branch-2.7 passed
        +1 mvneclipse 0m 24s branch-2.7 passed
        -1 findbugs 3m 48s hadoop-hdfs-project/hadoop-hdfs in branch-2.7 has 1 extant Findbugs warnings.
        +1 javadoc 1m 40s branch-2.7 passed with JDK v1.8.0_77
        +1 javadoc 2m 50s branch-2.7 passed with JDK v1.7.0_95
        +1 mvninstall 1m 20s the patch passed
        +1 compile 1m 39s the patch passed with JDK v1.8.0_77
        +1 javac 1m 39s the patch passed
        +1 compile 1m 28s the patch passed with JDK v1.7.0_95
        +1 javac 1m 28s the patch passed
        +1 checkstyle 0m 32s the patch passed
        +1 mvnsite 1m 18s the patch passed
        +1 mvneclipse 0m 11s the patch passed
        -1 whitespace 0m 0s The patch has 2433 line(s) that end in whitespace. Use git apply --whitespace=fix.
        -1 whitespace 0m 51s The patch has 250 line(s) with tabs.
        +1 findbugs 3m 17s the patch passed
        +1 javadoc 1m 17s the patch passed with JDK v1.8.0_77
        +1 javadoc 2m 0s the patch passed with JDK v1.7.0_95
        -1 unit 65m 44s hadoop-hdfs in the patch failed with JDK v1.8.0_77.
        -1 unit 57m 38s hadoop-hdfs in the patch failed with JDK v1.7.0_95.
        -1 asflicense 1m 6s Patch generated 65 ASF License warnings.
        164m 1s



        Reason Tests
        JDK v1.8.0_77 Failed junit tests hadoop.hdfs.web.TestWebHdfsTimeouts
          hadoop.hdfs.server.datanode.TestBlockScanner
          hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
          hadoop.hdfs.server.blockmanagement.TestBlockManager
          hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
          hadoop.hdfs.server.balancer.TestBalancer
          hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
          hadoop.hdfs.server.namenode.TestDecommissioningStatus
        JDK v1.7.0_95 Failed junit tests hadoop.hdfs.TestHFlush
          hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
          hadoop.hdfs.server.balancer.TestBalancer
          hadoop.hdfs.server.namenode.TestDecommissioningStatus
          hadoop.hdfs.server.namenode.TestFileTruncate



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:c420dfe
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12796709/HDFS-9917-branch-2.7-002.patch
        JIRA Issue HDFS-9917
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 03dc641a2ac1 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision branch-2.7 / 353f379
        Default Java 1.7.0_95
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_77 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
        findbugs v3.0.0
        findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/15048/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
        whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/15048/artifact/patchprocess/whitespace-eol.txt
        whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/15048/artifact/patchprocess/whitespace-tabs.txt
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/15048/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_77.txt
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/15048/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
        unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15048/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_77.txt https://builds.apache.org/job/PreCommit-HDFS-Build/15048/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
        JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15048/testReport/
        asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/15048/artifact/patchprocess/patch-asflicense-problems.txt
        modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15048/console
        Powered by Apache Yetus 0.2.0 http://yetus.apache.org

        This message was automatically generated.

        vinayrpet Vinayakumar B added a comment -

        The latest branch-2.7 patch also looks good. +1.
        Will commit this tomorrow unless there are any objections.
        Thanks Brahma Reddy Battula for updating the patch.

        brahmareddy Brahma Reddy Battula added a comment -

        Uploaded the branch-2.7 patch to address the above comment. Kindly review.

        vinayrpet Vinayakumar B added a comment -

        +1 for the trunk patch.

        In the branch-2.7 patch, instead of changing the access modifier of pendingIncrementalBRperStorage to default (package-private), it's better to add a method that returns the size of the pending IBRs, i.e. getPendingIBRSize(), similar to the one added to IncrementalBlockReportManager in the trunk patch.

        Otherwise +1.
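
        The suggestion in this comment can be illustrated with a minimal sketch. The class below is hypothetical and is not the actual BPServiceActor or IncrementalBlockReportManager code; only the names pendingIncrementalBRperStorage and getPendingIBRSize() come from the comment, and the sketch assumes that "size" means the number of per-storage entries. The point is that the map stays private and a test reads only its size.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the DataNode-side class that tracks pending IBRs.
class PendingIbrHolder {
  // Kept private so that callers (including tests) cannot mutate the map.
  private final Map<String, Integer> pendingIncrementalBRperStorage =
      new HashMap<>();

  // Illustrative helper: record one pending report entry for a storage.
  synchronized void addPending(String storageId) {
    pendingIncrementalBRperStorage.merge(storageId, 1, Integer::sum);
  }

  // Narrow accessor: number of storages that currently have pending IBRs.
  synchronized int getPendingIBRSize() {
    return pendingIncrementalBRperStorage.size();
  }
}
```

        A test can then assert on getPendingIBRSize() without the field's visibility being widened.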

        brahmareddy Brahma Reddy Battula added a comment -

        The test case failure is unrelated; HDFS-10253 has been raised to track it. Kindly review the branch-2.7 and trunk patches.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 10s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
        +1 mvninstall 6m 26s trunk passed
        +1 compile 0m 37s trunk passed with JDK v1.8.0_74
        +1 compile 0m 40s trunk passed with JDK v1.7.0_95
        +1 checkstyle 0m 21s trunk passed
        +1 mvnsite 0m 50s trunk passed
        +1 mvneclipse 0m 14s trunk passed
        +1 findbugs 1m 53s trunk passed
        +1 javadoc 1m 2s trunk passed with JDK v1.8.0_74
        +1 javadoc 1m 44s trunk passed with JDK v1.7.0_95
        +1 mvninstall 0m 45s the patch passed
        +1 compile 0m 35s the patch passed with JDK v1.8.0_74
        +1 javac 0m 35s the patch passed
        +1 compile 0m 38s the patch passed with JDK v1.7.0_95
        +1 javac 0m 38s the patch passed
        +1 checkstyle 0m 18s the patch passed
        +1 mvnsite 0m 48s the patch passed
        +1 mvneclipse 0m 12s the patch passed
        +1 whitespace 0m 0s Patch has no whitespace issues.
        +1 findbugs 2m 6s the patch passed
        +1 javadoc 1m 3s the patch passed with JDK v1.8.0_74
        +1 javadoc 1m 42s the patch passed with JDK v1.7.0_95
        -1 unit 54m 25s hadoop-hdfs in the patch failed with JDK v1.8.0_74.
        -1 unit 53m 5s hadoop-hdfs in the patch failed with JDK v1.7.0_95.
        +1 asflicense 0m 20s Patch does not generate ASF License warnings.
        131m 53s



        Reason Tests
        JDK v1.8.0_74 Failed junit tests hadoop.TestRefreshCallQueue
        JDK v1.7.0_95 Failed junit tests hadoop.TestRefreshCallQueue
          hadoop.hdfs.TestHFlush



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:fbe3e86
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12796572/HDFS-9917-02.patch
        JIRA Issue HDFS-9917
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 1f7798d9e41b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 256c82f
        Default Java 1.7.0_95
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
        findbugs v3.0.0
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/15041/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/15041/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
        unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15041/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt https://builds.apache.org/job/PreCommit-HDFS-Build/15041/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
        JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15041/testReport/
        modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15041/console
        Powered by Apache Yetus 0.2.0 http://yetus.apache.org

        This message was automatically generated.

        brahmareddy Brahma Reddy Battula added a comment -

        Re-uploaded the trunk patch to trigger Jenkins.

        brahmareddy Brahma Reddy Battula added a comment -

        Raised HDFS-10245 and HDFS-10248 for the Findbugs warnings and ASF License warnings. The test failures are unrelated.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 8m 17s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
        +1 mvninstall 7m 32s branch-2.7 passed
        +1 compile 1m 3s branch-2.7 passed with JDK v1.8.0_77
        +1 compile 0m 59s branch-2.7 passed with JDK v1.7.0_95
        +1 checkstyle 0m 26s branch-2.7 passed
        +1 mvnsite 0m 59s branch-2.7 passed
        +1 mvneclipse 0m 17s branch-2.7 passed
        -1 findbugs 3m 2s hadoop-hdfs-project/hadoop-hdfs in branch-2.7 has 1 extant Findbugs warnings.
        +1 javadoc 1m 10s branch-2.7 passed with JDK v1.8.0_77
        +1 javadoc 1m 54s branch-2.7 passed with JDK v1.7.0_95
        +1 mvninstall 0m 52s the patch passed
        +1 compile 0m 54s the patch passed with JDK v1.8.0_77
        +1 javac 0m 54s the patch passed
        +1 compile 0m 57s the patch passed with JDK v1.7.0_95
        +1 javac 0m 57s the patch passed
        -1 checkstyle 0m 21s hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 33 unchanged - 0 fixed = 34 total (was 33)
        +1 mvnsite 0m 54s the patch passed
        +1 mvneclipse 0m 11s the patch passed
        -1 whitespace 0m 0s The patch has 1732 line(s) that end in whitespace. Use git apply --whitespace=fix.
        -1 whitespace 0m 44s The patch has 250 line(s) with tabs.
        +1 findbugs 3m 5s the patch passed
        +1 javadoc 1m 5s the patch passed with JDK v1.8.0_77
        +1 javadoc 1m 53s the patch passed with JDK v1.7.0_95
        -1 unit 44m 33s hadoop-hdfs in the patch failed with JDK v1.8.0_77.
        -1 unit 43m 15s hadoop-hdfs in the patch failed with JDK v1.7.0_95.
        -1 asflicense 1m 8s Patch generated 65 ASF License warnings.
        128m 5s



        Reason Tests
        JDK v1.8.0_77 Failed junit tests hadoop.hdfs.TestDatanodeRegistration
          hadoop.tools.TestJMXGet
          hadoop.hdfs.server.balancer.TestBalancer
          hadoop.hdfs.server.datanode.TestDataNodeMetrics
          hadoop.hdfs.TestHFlush
          hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
        JDK v1.7.0_95 Failed junit tests hadoop.hdfs.server.datanode.TestBlockReplacement
          hadoop.tools.TestJMXGet
          hadoop.hdfs.TestHFlush
          hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:c420dfe
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12796518/HDFS-9917-branch-2.7.patch
        JIRA Issue HDFS-9917
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux ec0eccd34b49 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision branch-2.7 / 856a131
        Default Java 1.7.0_95
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_77 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
        findbugs v3.0.0
        findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/15036/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
        checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/15036/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
        whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/15036/artifact/patchprocess/whitespace-eol.txt
        whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/15036/artifact/patchprocess/whitespace-tabs.txt
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/15036/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_77.txt
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/15036/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
        unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15036/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_77.txt https://builds.apache.org/job/PreCommit-HDFS-Build/15036/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
        JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15036/testReport/
        asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/15036/artifact/patchprocess/patch-asflicense-problems.txt
        modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15036/console
        Powered by Apache Yetus 0.2.0 http://yetus.apache.org

        This message was automatically generated.

        brahmareddy Brahma Reddy Battula added a comment -

        > Please file a follow up jira for the "Avoid accumulation of IBRs for SNN when the standby is down for more than expected time".

        Raised HDFS-10244.

        > Seeing the criticality of this issue, I feel it would be better to land this in 2.7.3 with reRegister() IBR clearance fix.

        Uploaded the patch. Kindly review.

        vinayrpet Vinayakumar B added a comment -

        Since the second part of this issue needs more discussion about obtaining heartBeatExpiryInterval on the datanode side, it could be done in a follow-up Jira.
        Brahma Reddy Battula, Please file a follow up jira for the "Avoid accumulation of IBRs for SNN when the standby is down for more than expected time".

        Seeing the criticality of this issue, I feel it would be better to land this in 2.7.3 with reRegister() IBR clearance fix.

        The current changes look good for the fix.
        Please add a test to verify the same. Mock tests would be sufficient. TestBPOfferService.java contains similar tests; you can refer to them.

        vinayrpet Vinayakumar B added a comment -

        The current changes for clearing IBRs on reRegister() look good.

        For the second part, i.e. avoiding accumulation of IBRs when the standby is down for a long time, can we consider the following? (Already mentioned in my comment above.)

        1. IBRs for the StandbyNN can have a threshold (say 100K or 1 million IBRs).
        2. Also, so as not to lose any important IBRs, IBRs can be cleared only when the threshold is reached AND the time since 'lastIBR' exceeds 'heartbeatExpiryInterval', i.e. the DataNode is considered dead on the NameNode side. In that case, reRegister() will certainly be called on reconnection to a running NameNode (if any).

        The only question is that heartBeatExpiryInterval in the NameNode depends on the conf "dfs.namenode.heartbeat.recheck-interval", which is a NameNode-side configuration. By default this is 5 min. If it is changed on the NameNode side, the same change must also be present in the DataNode config. Is it okay to use this, or should we introduce a conf common to NN and DN?

        Tsz Wo Nicholas Sze, what is your opinion on this?
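
        The two-condition rule proposed in this comment can be sketched as follows. This is only an illustration under stated assumptions, not the HDFS implementation: the class and method names are hypothetical, the 100K threshold is the example value from the proposal, and the expiry formula (twice the recheck interval plus ten heartbeat intervals, using a 3-second heartbeat interval) reflects my understanding of how the NameNode derives the value and should be checked against the release in use.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the proposed standby IBR limit; not HDFS code. */
final class StandbyIbrLimitSketch {
  // Assumed defaults: 5 min recheck interval (from the comment), 3 s heartbeat interval.
  static final long RECHECK_INTERVAL_MS = TimeUnit.MINUTES.toMillis(5);
  static final long HEARTBEAT_INTERVAL_SEC = 3;

  /** Assumed expiry formula: 2 * recheck interval + 10 heartbeat intervals, in ms. */
  static long heartbeatExpiryMs(long recheckMs, long heartbeatSec) {
    return 2 * recheckMs + 10 * 1000 * heartbeatSec;
  }

  /** Both conditions must hold before the pending IBRs may be dropped. */
  static boolean shouldClear(long pendingIbrs, long threshold,
                             long msSinceLastIbr, long expiryMs) {
    return pendingIbrs >= threshold && msSinceLastIbr > expiryMs;
  }

  public static void main(String[] args) {
    long expiry = heartbeatExpiryMs(RECHECK_INTERVAL_MS, HEARTBEAT_INTERVAL_SEC);
    System.out.println("expiry ms = " + expiry);
    // Over the threshold and past expiry: the backlog may be cleared.
    System.out.println(shouldClear(1_000_000, 100_000, expiry + 1, expiry));
    // Over the threshold, but the NameNode was reached one minute ago: keep the IBRs.
    System.out.println(shouldClear(1_000_000, 100_000, 60_000, expiry));
  }
}
```

        Requiring both conditions means a DataNode never drops IBRs for a NameNode that still considers it alive, which is the concern behind the question about sharing the interval setting between NN and DN.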

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 18s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        +1 mvninstall 11m 3s trunk passed
        +1 compile 1m 33s trunk passed with JDK v1.8.0_74
        +1 compile 1m 10s trunk passed with JDK v1.7.0_95
        +1 checkstyle 0m 34s trunk passed
        +1 mvnsite 1m 25s trunk passed
        +1 mvneclipse 0m 22s trunk passed
        +1 findbugs 2m 45s trunk passed
        +1 javadoc 1m 56s trunk passed with JDK v1.8.0_74
        +1 javadoc 3m 5s trunk passed with JDK v1.7.0_95
        +1 mvninstall 1m 15s the patch passed
        +1 compile 1m 31s the patch passed with JDK v1.8.0_74
        +1 javac 1m 31s the patch passed
        +1 compile 1m 5s the patch passed with JDK v1.7.0_95
        +1 javac 1m 5s the patch passed
        +1 checkstyle 0m 29s the patch passed
        +1 mvnsite 1m 15s the patch passed
        +1 mvneclipse 0m 18s the patch passed
        +1 whitespace 0m 0s Patch has no whitespace issues.
        +1 findbugs 3m 3s the patch passed
        +1 javadoc 1m 56s the patch passed with JDK v1.8.0_74
        +1 javadoc 2m 59s the patch passed with JDK v1.7.0_95
        -1 unit 80m 45s hadoop-hdfs in the patch failed with JDK v1.8.0_74.
        -1 unit 111m 10s hadoop-hdfs in the patch failed with JDK v1.7.0_95.
        -1 asflicense 0m 37s Patch generated 2 ASF License warnings.
        234m 46s



        Reason Tests
        JDK v1.8.0_74 Failed junit tests hadoop.hdfs.server.namenode.ha.TestHASafeMode
          hadoop.hdfs.server.namenode.TestEditLog
          hadoop.hdfs.server.namenode.ha.TestEditLogTailer
          hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits
          hadoop.hdfs.security.TestDelegationTokenForProxyUser
          hadoop.hdfs.server.namenode.TestNamenodeCapacityReport
          hadoop.hdfs.TestSafeModeWithStripedFile
          hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
        JDK v1.8.0_74 Timed out junit tests org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade
          org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter
          org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation
          org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache
        JDK v1.7.0_95 Failed junit tests hadoop.hdfs.server.datanode.TestDirectoryScanner
          hadoop.hdfs.server.blockmanagement.TestBlockManager
          hadoop.hdfs.qjournal.TestSecureNNWithQJM
          hadoop.hdfs.server.namenode.ha.TestEditLogTailer
          hadoop.hdfs.server.datanode.TestDataNodeUUID
          hadoop.hdfs.security.TestDelegationTokenForProxyUser
          hadoop.hdfs.server.namenode.TestFSImageWithSnapshot
          hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
          hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl
          hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:fbe3e86
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12794672/HDFS-9917.patch
        JIRA Issue HDFS-9917
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 79ae4a0244b9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / e7ed05e
        Default Java 1.7.0_95
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
        findbugs v3.0.0
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/14890/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/14890/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
        unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/14890/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt https://builds.apache.org/job/PreCommit-HDFS-Build/14890/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
        JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/14890/testReport/
        asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/14890/artifact/patchprocess/patch-asflicense-problems.txt
        modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output https://builds.apache.org/job/PreCommit-HDFS-Build/14890/console
        Powered by Apache Yetus 0.2.0 http://yetus.apache.org

        This message was automatically generated.

        brahmareddy Brahma Reddy Battula added a comment -

        Uploaded the patch; kindly review. Can we also limit the number of IBRs queued for the standby, since the DN keeps accumulating them and uses a lot of memory?

        szetszwo Tsz Wo Nicholas Sze added a comment -

        Brahma Reddy Battula, your proposal on reRegister() sounds great, thanks.

        brahmareddy Brahma Reddy Battula added a comment -

        Ping Tsz Wo Nicholas Sze.

        brahmareddy Brahma Reddy Battula added a comment -

        I meant to say that we can avoid the RPC to the NameNode and the unnecessary GC for these IBRs.

        brahmareddy Brahma Reddy Battula added a comment -

        > I suggest that NN could just ignore the pending IBRs before the first full BR. Would it fix the problem?

        Yes, I think it's the same as clearing on reRegister() at the datanode itself.
        The advantage of clearing on reRegister() in the DN itself is that, otherwise,
        unnecessary RPCs would go to the NameNode and the NameNode would need unnecessary GC for these IBRs.

        We may also need to limit how many IBRs the DN accumulates, since they use a lot of memory.

        szetszwo Tsz Wo Nicholas Sze added a comment -

        > Before Full BR, all pending IBRs will be flushed. ...

        Yes, this is the current problem. I suggest that NN could just ignore the pending IBRs before the first full BR. Would it fix the problem?

        brahmareddy Brahma Reddy Battula added a comment -

        Before the full BR, all pending IBRs will be flushed.
        In the current problem case, the IBR itself was larger than the FBR, and the IBR itself failed because the NN was not able to process it completely. That's why it kept accumulating.

        szetszwo Tsz Wo Nicholas Sze added a comment -

        When SNN is restarted, DNs send a full BR to it. Then, the IBRs collected before the full BR can be dropped. Is it the case?

        vinayrpet Vinayakumar B added a comment -

        > Clear the IBRs on re-register to namenode.

        I think this is fine. This is only one part of the solution to make the SNN start successfully.

        It is also required to limit the number of IBRs for the Standby.

        1. Maybe IBRs for the StandbyNN can have a threshold (say 100K or 1 million IBRs).
        2. Also, so as not to lose any important IBRs, IBRs can be cleared only when the threshold is reached AND the time since 'lastIBR' exceeds 'heartbeatExpiryInterval', i.e. the DataNode is considered dead on the NameNode side.

        Tsz Wo Nicholas Sze/Jing Zhao, does this make sense to you?

        brahmareddy Brahma Reddy Battula added a comment -

        As the current intention is not to overload the NN, I am planning to fix it as follows:

        • Clear the IBRs on re-register to namenode.
        void reRegister() throws IOException {
          if (shouldRun()) {
            // re-retrieve namespace info to make sure that, if the NN
            // was restarted, we still match its version (HDFS-2120)
            NamespaceInfo nsInfo = retrieveNamespaceInfo();
            // and re-register
            register(nsInfo);
            scheduler.scheduleHeartbeat();
            // HDFS-9917: the pending IBRs for the standby NN can be very large
            // if the standby namenode was down for some time.
            if (state == HAServiceState.STANDBY) {
              ibrManager.clearIBRs();
            }
          }
        }
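
        The behaviour of this change can be illustrated with a self-contained sketch that does not depend on Hadoop. Every name here (ReRegisterSketch, HaState, notifyBlock, pendingCount) is hypothetical and only stands in for the per-NameNode actor and its pending IBR queue. Dropping the backlog is assumed to be safe because, as noted elsewhere in this thread, the DataNode sends a full block report to a restarted standby.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a per-NameNode actor; none of these names are HDFS classes. */
final class ReRegisterSketch {
  enum HaState { ACTIVE, STANDBY }

  private final List<String> pendingIbrs = new ArrayList<>();
  private final HaState state;

  ReRegisterSketch(HaState state) {
    this.state = state;
  }

  /** Queues a received/deleted block notification until it can be sent. */
  void notifyBlock(String blockId) {
    pendingIbrs.add(blockId);
  }

  int pendingCount() {
    return pendingIbrs.size();
  }

  /** Mirrors the proposed fix: drop the backlog only when re-registering with a standby. */
  void reRegister() {
    // The real method would re-fetch namespace info and register here.
    if (state == HaState.STANDBY) {
      pendingIbrs.clear();
    }
  }

  public static void main(String[] args) {
    ReRegisterSketch standby = new ReRegisterSketch(HaState.STANDBY);
    ReRegisterSketch active = new ReRegisterSketch(HaState.ACTIVE);
    for (int i = 0; i < 3; i++) {
      standby.notifyBlock("blk_" + i);
      active.notifyBlock("blk_" + i);
    }
    standby.reRegister();
    active.reRegister();
    System.out.println("standby pending = " + standby.pendingCount());
    System.out.println("active pending  = " + active.pendingCount());
  }
}
```

        A mock test for the real change could assert the same two outcomes: the queue for a standby is empty after re-registration, while the queue for the active NameNode is left untouched.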
        

        Any thoughts on this?

        brahmareddy Brahma Reddy Battula added a comment -

        The following was tested on trunk, not on the original cluster data.

        After stopping the SNN:
        =======================
        
        BLR1000006554:/opt/Trunk/hadoop/bin # jmap -histo:live 34458 | tee dnheap.log | grep -i org.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo
         19:          2801          67224  org.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo
         852:             3             72  org.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo$BlockStatus
        1234:             1             32  [Lorg.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo$BlockStatus;
        
        After 10 minutes of only writing files:
        =======================================
        
        BLR1000006554:/opt/Trunk/hadoop/bin # jmap -histo:live 34458 | tee dnheap.log | grep -i org.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo
           5:         73957        1774968  org.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo
         852:             3             72  org.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo$BlockStatus
        1234:             1             32  [Lorg.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo$BlockStatus;
        
        After Restart of SNN:
        ====================
        BLR1000006554:/opt/Trunk/hadoop/bin # jmap -histo:live 34458 | tee dnheap.log | grep -i org.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo
         848:             3             72  org.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo$BlockStatus
        1237:             1             32  [Lorg.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo$BlockStatus;
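
        The histogram figures above can be turned into a rough per-object size and growth rate. The sketch below is only arithmetic on the numbers shown; the class and method names are hypothetical, the sizes are shallow sizes as reported by jmap (referenced objects are not included), and the 5 million figure is the per-DataNode backlog quoted in the issue description, not a value measured in this test. Note that the ReceivedDeletedBlockInfo row no longer appears after the SNN restart, which suggests no live instances remained.

```java
/** Hypothetical helper for reading the jmap histogram rows; plain arithmetic only. */
final class IbrHistogramSketch {
  /** Shallow bytes per instance, from one histogram row. */
  static long bytesPerInstance(long instances, long bytes) {
    return bytes / instances;
  }

  /** Average number of new instances per minute between two samples. */
  static double growthPerMinute(long before, long after, double minutes) {
    return (after - before) / minutes;
  }

  /** Shallow heap used by a given number of instances. */
  static long shallowBytes(long instances, long bytesPerInstance) {
    return instances * bytesPerInstance;
  }

  public static void main(String[] args) {
    long perObject = bytesPerInstance(2801, 67224);
    System.out.println("bytes per instance  = " + perObject);
    System.out.println("same in 2nd sample  = " + bytesPerInstance(73957, 1774968));
    System.out.println("growth per minute   = " + growthPerMinute(2801, 73957, 10.0));
    // Extrapolation to the backlog size quoted in the issue description (assumption).
    System.out.println("shallow bytes at 5M = " + shallowBytes(5_000_000L, perObject));
  }
}
```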
        

          People

          • Assignee:
            brahmareddy Brahma Reddy Battula
            Reporter:
            brahmareddy Brahma Reddy Battula
          • Votes:
            0
            Watchers:
            13


              Development