HDFS-11224: Lifeline message should be ignored for dead nodes

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha2
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The NameNode should ignore lifeline messages from dead nodes.
      Otherwise, cluster-level stats such as capacity and used space are doubled after the node re-registers.
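
To illustrate how the doubling described above can arise, here is a minimal toy model. It is a sketch under stated assumptions, not the actual NameNode code: `ToyNode`, `ToyClusterStats`, and all of their methods are hypothetical names, and the real bookkeeping in Hadoop may differ in detail. The model assumes that a node's contribution is subtracted when it is marked dead, and that re-registration of a non-alive node adds its storage to the totals.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model, NOT a Hadoop class: one record per DataNode. */
class ToyNode {
  boolean alive;
  long capacity;
  long used;
}

/** Toy model, NOT a Hadoop class: cluster-level totals kept by the NameNode. */
class ToyClusterStats {
  private final Map<String, ToyNode> nodes = new HashMap<>();
  private final boolean ignoreDeadLifelines;
  long totalCapacity;
  long totalUsed;

  ToyClusterStats(boolean ignoreDeadLifelines) {
    this.ignoreDeadLifelines = ignoreDeadLifelines;
  }

  private void add(ToyNode n) {
    totalCapacity += n.capacity;
    totalUsed += n.used;
  }

  private void subtract(ToyNode n) {
    totalCapacity -= n.capacity;
    totalUsed -= n.used;
  }

  /** (Re-)registration: a node that is not alive is counted as a new contributor. */
  void register(String id, long capacity, long used) {
    ToyNode n = nodes.computeIfAbsent(id, k -> new ToyNode());
    if (n.alive) {
      subtract(n); // replace the old contribution of a live node
    }
    n.capacity = capacity;
    n.used = used;
    n.alive = true;
    add(n);
  }

  /** Marking a node dead removes its contribution from the totals. */
  void markDead(String id) {
    ToyNode n = nodes.get(id);
    if (n != null && n.alive) {
      subtract(n);
      n.capacity = 0;
      n.used = 0;
      n.alive = false;
    }
  }

  /** Lifeline handling: refresh the node's storage numbers in the totals. */
  void handleLifeline(String id, long capacity, long used) {
    ToyNode n = nodes.get(id);
    if (n == null) {
      return; // node never registered
    }
    if (ignoreDeadLifelines && !n.alive) {
      return; // the guard: dead nodes must not touch the totals
    }
    subtract(n);
    n.capacity = capacity;
    n.used = used;
    add(n); // without the guard, a dead node is counted here...
  }
}
```

In this model, the sequence register, mark dead, lifeline, re-register leaves the totals at twice the node's real capacity when the guard is disabled, because the lifeline adds the dead node's storage once and the re-registration adds it again. With the guard enabled, the same sequence leaves the totals correct.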

      1. HDFS-11224-01.patch
        7 kB
        Vinayakumar B
      2. HDFS-11224-02.patch
        8 kB
        Vinayakumar B

        Activity

        vinayrpet Vinayakumar B added a comment -

        Attached the patch.
        Please review.

        brahmareddy Brahma Reddy Battula added a comment -

        Vinayakumar B, nice catch.

        Patch LGTM, apart from the following minor nit.

        Can we also update the following comment?

              // This is null if the DataNode has not yet registered.  We expect this
              // will never happen, because the DataNode has logic to prevent sending
              // lifeline messages until after initial registration is successful.
              // Lifeline message handling can't send commands back to the DataNode to
              // tell it to register, so simply exit.
        
        vinayrpet Vinayakumar B added a comment -

        Updated comment.

        hadoopqa Hadoop QA added a comment -
        +1 overall



        Vote  Subsystem   Runtime   Comment
        0     reexec      0m 14s    Docker mode activated.
        +1    @author     0m 0s     The patch does not contain any @author tags.
        +1    test4tests  0m 0s     The patch appears to include 2 new or modified test files.
        +1    mvninstall  8m 6s     trunk passed
        +1    compile     0m 50s    trunk passed
        +1    checkstyle  0m 30s    trunk passed
        +1    mvnsite     0m 57s    trunk passed
        +1    mvneclipse  0m 14s    trunk passed
        +1    findbugs    1m 54s    trunk passed
        +1    javadoc     0m 41s    trunk passed
        +1    mvninstall  0m 50s    the patch passed
        +1    compile     0m 47s    the patch passed
        +1    javac       0m 47s    the patch passed
        -0    checkstyle  0m 27s    hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 115 unchanged - 0 fixed = 116 total (was 115)
        +1    mvnsite     0m 53s    the patch passed
        +1    mvneclipse  0m 12s    the patch passed
        +1    whitespace  0m 0s     The patch has no whitespace issues.
        +1    findbugs    1m 58s    the patch passed
        +1    javadoc     0m 39s    the patch passed
        +1    unit        71m 49s   hadoop-hdfs in the patch passed.
        +1    asflicense  0m 21s    The patch does not generate ASF License warnings.
                          92m 42s



        Subsystem       Report/Notes
        Docker          Image:yetus/hadoop:a9ad5d6
        JIRA Issue      HDFS-11224
        JIRA Patch URL  https://issues.apache.org/jira/secure/attachment/12842315/HDFS-11224-01.patch
        Optional Tests  asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname           Linux 020478141ad4 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool      maven
        Personality     /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision    trunk / 9ef89ed
        Default Java    1.8.0_111
        findbugs        v3.0.0
        checkstyle      https://builds.apache.org/job/PreCommit-HDFS-Build/17797/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
        Test Results    https://builds.apache.org/job/PreCommit-HDFS-Build/17797/testReport/
        modules         C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output  https://builds.apache.org/job/PreCommit-HDFS-Build/17797/console
        Powered by      Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        hadoopqa Hadoop QA added a comment -
        +1 overall



        Vote  Subsystem   Runtime   Comment
        0     reexec      0m 12s    Docker mode activated.
        +1    @author     0m 0s     The patch does not contain any @author tags.
        +1    test4tests  0m 0s     The patch appears to include 2 new or modified test files.
        +1    mvninstall  6m 55s    trunk passed
        +1    compile     0m 44s    trunk passed
        +1    checkstyle  0m 26s    trunk passed
        +1    mvnsite     0m 52s    trunk passed
        +1    mvneclipse  0m 12s    trunk passed
        +1    findbugs    1m 41s    trunk passed
        +1    javadoc     0m 39s    trunk passed
        +1    mvninstall  0m 44s    the patch passed
        +1    compile     0m 42s    the patch passed
        +1    javac       0m 42s    the patch passed
        -0    checkstyle  0m 24s    hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 115 unchanged - 0 fixed = 116 total (was 115)
        +1    mvnsite     0m 49s    the patch passed
        +1    mvneclipse  0m 10s    the patch passed
        +1    whitespace  0m 0s     The patch has no whitespace issues.
        +1    findbugs    1m 46s    the patch passed
        +1    javadoc     0m 37s    the patch passed
        +1    unit        63m 30s   hadoop-hdfs in the patch passed.
        +1    asflicense  0m 19s    The patch does not generate ASF License warnings.
                          81m 50s



        Subsystem       Report/Notes
        Docker          Image:yetus/hadoop:a9ad5d6
        JIRA Issue      HDFS-11224
        JIRA Patch URL  https://issues.apache.org/jira/secure/attachment/12842321/HDFS-11224-02.patch
        Optional Tests  asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname           Linux c31c126ec000 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool      maven
        Personality     /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision    trunk / 9ef89ed
        Default Java    1.8.0_111
        findbugs        v3.0.0
        checkstyle      https://builds.apache.org/job/PreCommit-HDFS-Build/17798/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
        Test Results    https://builds.apache.org/job/PreCommit-HDFS-Build/17798/testReport/
        modules         C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output  https://builds.apache.org/job/PreCommit-HDFS-Build/17798/console
        Powered by      Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        anu Anu Engineer added a comment -

        Vinayakumar B, thanks for catching this and providing a patch. The patch looks very good. I wanted to flag a related race condition that is reported in HDFS-9911. Please look at the last comment there, which explains how to simulate that issue. If this patch fixes that issue too, would it make sense to add it as a test case? If not, do you think it makes sense to fix it via this JIRA?

        anu Anu Engineer added a comment -

        Btw, I am +1 on committing this.

        vinayrpet Vinayakumar B added a comment -

        Thanks, Anu Engineer, for the review.
        HDFS-9911 is not related to this fix. The current fix only avoids the effect of the RPC; it cannot avoid the RPC itself, which increments the metric value asserted in HDFS-9911's failure.
        So I have commented on the possible reason for the failure in HDFS-9911 itself. Please check.
        I will commit this JIRA shortly.
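
The distinction drawn in this comment, between suppressing the effect of an RPC and preventing the RPC itself, can be sketched as follows. This is only an illustration under assumptions: `ToyLifelineServer`, `lifelineRpcCount`, and `lifelineRpc` are hypothetical names, not the real NameNode handler or the metric referenced in HDFS-9911.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model, NOT a Hadoop class: a server that counts every lifeline RPC it receives. */
class ToyLifelineServer {
  /** Hypothetical metric: incremented as soon as the call arrives. */
  final AtomicLong lifelineRpcCount = new AtomicLong();
  long totalCapacity;
  boolean nodeAlive;

  void lifelineRpc(long reportedCapacity) {
    // The call has already reached the server, so the metric moves regardless.
    lifelineRpcCount.incrementAndGet();
    if (!nodeAlive) {
      return; // the guard suppresses only the effect on cluster stats
    }
    totalCapacity = reportedCapacity;
  }
}
```

In this sketch, a lifeline from a dead node leaves `totalCapacity` untouched but still raises `lifelineRpcCount`, so a test that asserts on the RPC count would behave the same with or without the guard.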

        vinayrpet Vinayakumar B added a comment -

        Committed to trunk, branch-2 and branch-2.8.
        Thanks, Brahma Reddy Battula and Anu Engineer, for the reviews.

        hudson Hudson added a comment -

        SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10975 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10975/)
        HDFS-11224. Lifeline message should be ignored for dead nodes (vinayakumarb: rev d1d4aba71b21871140b162583a4b94ce118e1fb3)

        • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
        • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
        • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeLifeline.java
        • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
        • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerTestUtil.java

          People

          • Assignee:
            vinayrpet Vinayakumar B
            Reporter:
            vinayrpet Vinayakumar B
          • Votes:
            0
            Watchers:
            7
