Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: HA branch (HDFS-1623)
    • Fix Version/s: HA branch (HDFS-1623)
    • Component/s: ha, namenode
    • Labels:
      None

      Description

      This JIRA is to address a TODO in NameNode about actually implementing the checkHealth RPC.

      1. HDFS-3027-HDFS-1623.patch
        8 kB
        Aaron T. Myers
      2. HDFS-3027-HDFS-1623.patch
        8 kB
        Aaron T. Myers
      3. HDFS-3027-HDFS-1623.patch
        5 kB
        Aaron T. Myers

        Issue Links

          Activity

          Aaron T. Myers created issue -
          Aaron T. Myers added a comment -

          Here's a patch which addresses the issue. Over on HDFS-2920 (where this patch was originally posted) Eli had the following feedback:

          !nameNodeHasResourcesAvailable implies "The NameNode has run out of resources" rather than "NameNode is low on resources". It would be even better if the message were more specific (e.g. mentioned a lack of inodes or disk space).

          Yea, agree, but doing that would be a more invasive change, and more fundamental to the way the NN currently checks resources, since the check would have to return a status and a reason. Mind if I punt that to another JIRA?

          Aaron T. Myers made changes -
          Field Original Value New Value
          Attachment HDFS-3027-HDFS-1623.patch [ 12516460 ]
          Eli Collins added a comment -

          Yea, feel free to punt the specific message to another jira. For now we should at least say "The NameNode has no resources available" to match the code, and since we're failing to run (vs being low which shouldn't cause us to necessarily fail the health check).

          Also, think you left out the test change from the patch?
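
For illustration, the behaviour agreed on in this comment (fail the health check with the message "The NameNode has no resources available" when the resource check returns false) could be sketched roughly as below. This is not the code from the patch: `NameNodeHealthSketch`, `monitorHealth`, and the local `HealthCheckFailedException` are hypothetical names chosen so the example compiles on its own, and the resource check is passed in as a plain `BooleanSupplier`.

```java
import java.util.function.BooleanSupplier;

// Local stand-in for a health-check failure type, defined here so the
// sketch compiles on its own (hypothetical, not the class from the patch).
class HealthCheckFailedException extends Exception {
    HealthCheckFailedException(String message) {
        super(message);
    }
}

// Hypothetical wrapper around whatever resource check the NameNode performs.
class NameNodeHealthSketch {
    // Message wording taken from the review comment above.
    static final String NO_RESOURCES_MESSAGE =
        "The NameNode has no resources available";

    private final BooleanSupplier resourcesAvailable;

    NameNodeHealthSketch(BooleanSupplier resourcesAvailable) {
        this.resourcesAvailable = resourcesAvailable;
    }

    // Returns normally when healthy; throws when the resource check fails.
    void monitorHealth() throws HealthCheckFailedException {
        if (!resourcesAvailable.getAsBoolean()) {
            throw new HealthCheckFailedException(NO_RESOURCES_MESSAGE);
        }
    }
}
```

A caller would treat a normal return as "healthy" and a thrown exception as "unhealthy", with the reason carried in the exception message.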

          Aaron T. Myers added a comment -

          Thanks a lot for the review, Eli. My bad on forgetting to include the test. Here's an updated patch.

          Aaron T. Myers made changes -
          Attachment HDFS-3027-HDFS-1623.patch [ 12516485 ]
          Eli Collins added a comment -

          Looks like NNResourceChecker#hasAvailableDiskSpace doesn't actually throw IOE so checkAvailableResources doesn't need to either, so you can remove the new catch of IOE case in NN#healthCheck. Otherwise looks great.
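
To make the point about the unneeded IOE concrete, here is a minimal sketch of the call chain with no checked exception declared anywhere, so the caller needs no catch block. The class names `ResourceCheckerSketch` and `NameNodeSketch` are hypothetical, and the "one volume with enough space is sufficient" policy is an assumption made for the example, not something taken from the patch.

```java
import java.util.Optional;

// Hypothetical resource checker: the check only compares numbers, so it
// declares no checked exceptions.
class ResourceCheckerSketch {
    private final long reservedBytes;
    private final long[] freeBytesPerVolume;

    ResourceCheckerSketch(long reservedBytes, long... freeBytesPerVolume) {
        this.reservedBytes = reservedBytes;
        this.freeBytesPerVolume = freeBytesPerVolume.clone();
    }

    // Assumed policy for this sketch: one volume with enough space suffices.
    boolean hasAvailableDiskSpace() {
        for (long free : freeBytesPerVolume) {
            if (free >= reservedBytes) {
                return true;
            }
        }
        return false;
    }
}

class NameNodeSketch {
    private final ResourceCheckerSketch checker;

    NameNodeSketch(ResourceCheckerSketch checker) {
        this.checker = checker;
    }

    // No "throws IOException": nothing underneath can raise one.
    boolean checkAvailableResources() {
        return checker.hasAvailableDiskSpace();
    }

    // Returns a failure reason, or empty when healthy. Because the callee
    // declares no checked exception, no try/catch is needed here.
    Optional<String> healthCheck() {
        return checkAvailableResources()
            ? Optional.empty()
            : Optional.of("The NameNode has no resources available");
    }
}
```

Had `hasAvailableDiskSpace` kept a `throws IOException` clause, every caller up the chain would have to either redeclare it or catch an exception that can never occur.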

          Aaron T. Myers added a comment -

          Good catch re: unnecessarily-declared IOEs. Here's an updated patch which takes care of that.

          Aaron T. Myers made changes -
          Attachment HDFS-3027-HDFS-1623.patch [ 12516610 ]
          Eli Collins added a comment -

          +1

          Aaron T. Myers added a comment -

          Thanks a lot for the reviews, Eli. I've just committed this to the HA branch.

          Aaron T. Myers made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Hadoop Flags Reviewed [ 10343 ]
          Fix Version/s HA branch (HDFS-1623) [ 12317568 ]
          Resolution Fixed [ 1 ]
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-HAbranch-build #93 (See https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/93/)
          HDFS-3027. Implement a simple NN health check. Contributed by Aaron T. Myers. (Revision 1295300)

          Result = UNSTABLE
          atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1295300
          Files :

          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeResourceChecker.java
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestNNHealthCheck.java
          Eli Collins made changes -
          Link This issue relates to HDFS-2704 [ HDFS-2704 ]
          Eli Collins made changes -
          Link This issue relates to HDFS-3090 [ HDFS-3090 ]

            People

            • Assignee:
              Aaron T. Myers
              Reporter:
              Aaron T. Myers
            • Votes:
              0
              Watchers:
              3

              Dates

              • Created:
                Updated:
                Resolved:
