Hadoop Common
  1. Hadoop Common
  2. HADOOP-6977

Herriot daemon clients should vend statistics

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.22.0
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Tags:
      herriot

      Description

      The HDFS web user interface serves useful information through dfshealth.jsp and dfsnodelist.jsp.

      The Herriot interface to Hadoop cluster daemons would benefit from the addition of some way to channel metics information.

      1. FinderTest.java
        0.9 kB
        Al Thompson
      2. HADOOP-6977.patch
        11 kB
        Konstantin Boudnik
      3. HADOOP-6977.patch
        11 kB
        Konstantin Boudnik
      4. HADOOP-6977.patch
        11 kB
        Konstantin Boudnik
      5. HADOOP-6977.patch
        9 kB
        Konstantin Boudnik
      6. HADOOP-6977.patch
        9 kB
        Konstantin Boudnik
      7. HADOOP-6977.patch
        9 kB
        Konstantin Boudnik
      8. HADOOP-6977.y20S.patch
        17 kB
        Konstantin Boudnik
      9. HADOOP-6977.y20S.patch
        15 kB
        Konstantin Boudnik

        Issue Links

          Activity

          Konstantin Shvachko made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #509 (See https://hudson.apache.org/hudson/job/Hadoop-Common-trunk/509/)
          HADOOP-6977. Herriot daemon clients should vend statistics. Contributed by Konstantin Boudnik

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk #509 (See https://hudson.apache.org/hudson/job/Hadoop-Common-trunk/509/ ) HADOOP-6977 . Herriot daemon clients should vend statistics. Contributed by Konstantin Boudnik
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #420 (See https://hudson.apache.org/hudson/job/Hadoop-Common-trunk-Commit/420/)
          HADOOP-6977. Herriot daemon clients should vend statistics. Contributed by Konstantin Boudnik

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #420 (See https://hudson.apache.org/hudson/job/Hadoop-Common-trunk-Commit/420/ ) HADOOP-6977 . Herriot daemon clients should vend statistics. Contributed by Konstantin Boudnik
          Konstantin Boudnik made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Fix Version/s 0.22.0 [ 12314296 ]
          Resolution Fixed [ 1 ]
          Hide
          Konstantin Boudnik added a comment -

          I have just committed it to trunk.

          Show
          Konstantin Boudnik added a comment - I have just committed it to trunk.
          Hide
          Konstantin Boudnik added a comment -

          In order for this functionality to work Hadoop daemons needs to have two system properties set:

          • -Dcom.sun.management.jmxremote (to enable JMX)
          • -Dcom.sun.management.jmxremote.port (to set the static port for JMX connections)
          Show
          Konstantin Boudnik added a comment - In order for this functionality to work Hadoop daemons needs to have two system properties set: -Dcom.sun.management.jmxremote (to enable JMX) -Dcom.sun.management.jmxremote.port (to set the static port for JMX connections)
          Hide
          Konstantin Boudnik added a comment -

          I'm going to commit this by end of today. Now, this patch will cause a brief break in HDFS and MR parts of Herriot, but it is nothing to worry about as the upstream patches will be committed right away.

          Show
          Konstantin Boudnik added a comment - I'm going to commit this by end of today. Now, this patch will cause a brief break in HDFS and MR parts of Herriot, but it is nothing to worry about as the upstream patches will be committed right away.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12459228/HADOOP-6977.patch
          against trunk revision 1032730.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/87//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/87//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/87//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12459228/HADOOP-6977.patch against trunk revision 1032730. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/87//testReport/ Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/87//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/87//console This message is automatically generated.
          Konstantin Boudnik made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Konstantin Boudnik made changes -
          Attachment HADOOP-6977.patch [ 12459228 ]
          Hide
          Konstantin Boudnik added a comment -

          It turns out that I was too quick to make [https://issues.apache.org/jira/browse/HADOOP-6977?focusedCommentId=12917775&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12917775|optimisation offered here]. It drew the patch to be dysfunctional. Removing the optimisation.

          I have ran JMX test from HDFS using current version of the patch: all work. I'm going to rerun patch validation once for just in case.

          Show
          Konstantin Boudnik added a comment - It turns out that I was too quick to make [https://issues.apache.org/jira/browse/HADOOP-6977?focusedCommentId=12917775&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12917775|optimisation offered here] . It drew the patch to be dysfunctional. Removing the optimisation. I have ran JMX test from HDFS using current version of the patch: all work. I'm going to rerun patch validation once for just in case.
          Hide
          Boris Shkolnik added a comment -

          +1 for latest change

          Show
          Boris Shkolnik added a comment - +1 for latest change
          Konstantin Boudnik made changes -
          Attachment HADOOP-6977.patch [ 12458299 ]
          Hide
          Konstantin Boudnik added a comment -

          Addressing good finding Boris made at HDFS-1408.

          Show
          Konstantin Boudnik added a comment - Addressing good finding Boris made at HDFS-1408 .
          Hide
          Konstantin Boudnik added a comment -

          Running test-patch on my laptop.

               [exec] -1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
               [exec] 
               [exec]     -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
               [exec] 
               [exec]     +1 system tests framework.  The patch passed system tests framework compile.
          

          Warning seems to be coming from security related code as in last N test-patch runs.

          Show
          Konstantin Boudnik added a comment - Running test-patch on my laptop. [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] -1 javadoc. The javadoc tool appears to have generated 1 warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 system tests framework. The patch passed system tests framework compile. Warning seems to be coming from security related code as in last N test-patch runs.
          Hide
          Konstantin Boudnik added a comment -

          Thanks for the review, Tanping. That's exactly the idea. If you take a look at the latest HDFS-1408 it implements this approach.

          Show
          Konstantin Boudnik added a comment - Thanks for the review, Tanping. That's exactly the idea. If you take a look at the latest HDFS-1408 it implements this approach.
          Hide
          Tanping Wang added a comment -

          In the newer patch, "HADOOP_OPTS" is passed in as parameter to check isJmxEnabled and getJmxPort. I assume if user uses specific options for each demon, for example enabling JMX for name node daemon by passing in HADOOP_NAMENODE_OPTS,
          export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS -Dcom.sun.management.jmxremote.port=8004″
          then the parameter to be searched will be changed from "HADOOP_OPTS" to "HADOOP_NAMENODE_OPTS". If my understanding is corret, I think the patch is good.

          Show
          Tanping Wang added a comment - In the newer patch, "HADOOP_OPTS" is passed in as parameter to check isJmxEnabled and getJmxPort. I assume if user uses specific options for each demon, for example enabling JMX for name node daemon by passing in HADOOP_NAMENODE_OPTS, export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS -Dcom.sun.management.jmxremote.port=8004″ then the parameter to be searched will be changed from "HADOOP_OPTS" to "HADOOP_NAMENODE_OPTS". If my understanding is corret, I think the patch is good.
          Konstantin Boudnik made changes -
          Attachment HADOOP-6977.patch [ 12457229 ]
          Hide
          Konstantin Boudnik added a comment -

          Making some changes to allow for daemon specific lookups into process environment. Specific implementations will be posted as hdfs/mr patches.

          Show
          Konstantin Boudnik added a comment - Making some changes to allow for daemon specific lookups into process environment. Specific implementations will be posted as hdfs/mr patches.
          Hide
          Konstantin Boudnik added a comment -

          In fact, we can enable JMX ( with specific port number ) by modifying command specific options

          I think this is valid and I'll make the change to look for JMX specific parameters in a daemon specific options (i.e. HADOOP_NAMENODE_OPTS) as well as in HADOOP_OPTS. That should cover all the bases.

          Typo: check, good catch.

          FinderTest was an example of loops optimization and isn't a part of the patch.

          Show
          Konstantin Boudnik added a comment - In fact, we can enable JMX ( with specific port number ) by modifying command specific options I think this is valid and I'll make the change to look for JMX specific parameters in a daemon specific options (i.e. HADOOP_NAMENODE_OPTS) as well as in HADOOP_OPTS. That should cover all the bases. Typo: check, good catch. FinderTest was an example of loops optimization and isn't a part of the patch.
          Hide
          Tanping Wang added a comment -
          1. I looked at HADOOP-6977.patch. Overall I think it is good. One concern that I had was that in the methods of
            • getJmxPortNumber()
              and
            • isJmxEnabled() ,

          "HADOOP_OPTS" was hard-coded and searched.

          This require us to enable remote JMX by putting all the JMX options into HADOOP_OPTS. In fact, we can enable JMX ( with specific port number ) by modifying command specific options, e.g.

          1. Command specific options appended to HADOOP_OPTS when specified
            export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS -Dcom.sun.management.jmxremote.port=8004″
            export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS -Dcom.sun.management.jmxremote.port=8005″
            export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS -Dcom.sun.management.jmxremote.port=8006″
            export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS -Dcom.sun.management.jmxremote.port=8007″
            export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS -Dcom.sun.management.jmxremote.port=8008″
            export HADOOP_TASKTRACKER_OPTS="-Dcom.sun.management.jmxremote.port=8009″

          There is also a typo
          /**
          Create connection with the remove JMX server at given host and port

          I believe you mean remote but not remove?

          1. FinderTest.java
            This test should be refactor to follow Junit 4 ?
          Show
          Tanping Wang added a comment - I looked at HADOOP-6977 .patch. Overall I think it is good. One concern that I had was that in the methods of getJmxPortNumber() and isJmxEnabled() , "HADOOP_OPTS" was hard-coded and searched. This require us to enable remote JMX by putting all the JMX options into HADOOP_OPTS. In fact, we can enable JMX ( with specific port number ) by modifying command specific options, e.g. Command specific options appended to HADOOP_OPTS when specified export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS -Dcom.sun.management.jmxremote.port=8004″ export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS -Dcom.sun.management.jmxremote.port=8005″ export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS -Dcom.sun.management.jmxremote.port=8006″ export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS -Dcom.sun.management.jmxremote.port=8007″ export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS -Dcom.sun.management.jmxremote.port=8008″ export HADOOP_TASKTRACKER_OPTS="-Dcom.sun.management.jmxremote.port=8009″ There is also a typo /** Create connection with the remove JMX server at given host and port I believe you mean remote but not remove? FinderTest.java This test should be refactor to follow Junit 4 ?
          Hide
          Konstantin Boudnik added a comment -

          I have ran test-patch.sh locally and all look Ok

          +1 overall.
          
              +1 @author.  The patch does not contain any @author tags.
          
              +1 tests included.  The patch appears to include 3 new or modified tests.
          
              +1 javadoc.  The javadoc tool did not generate any warning messages.
          
              +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
          
              +1 findbugs.  The patch does not introduce any new Findbugs warnings.
          
              +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          
              +1 system tests framework.  The patch passed system tests framework compile.
          

          Also, I used the version of Common built with this patch to run TestHL040 from HDFS-1408 and it passed. So, it seems that patch is ready to be committed to trunk.

          Show
          Konstantin Boudnik added a comment - I have ran test-patch.sh locally and all look Ok +1 overall. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 system tests framework. The patch passed system tests framework compile. Also, I used the version of Common built with this patch to run TestHL040 from HDFS-1408 and it passed. So, it seems that patch is ready to be committed to trunk.
          Konstantin Boudnik made changes -
          Attachment HADOOP-6977.patch [ 12456348 ]
          Hide
          Konstantin Boudnik added a comment -

          Thanks Al: totally makes sense. The code looks much cleaner now. I guess I need to abandon my JDK1.1 code habits and learn how big guys are doing it nowadays.

          Show
          Konstantin Boudnik added a comment - Thanks Al: totally makes sense. The code looks much cleaner now. I guess I need to abandon my JDK1.1 code habits and learn how big guys are doing it nowadays.
          Al Thompson made changes -
          Attachment FinderTest.java [ 12456326 ]
          Hide
          Al Thompson added a comment -

          A simple JUnit test which illustrates alternatives to searching for a key in a String array.

          Show
          Al Thompson added a comment - A simple JUnit test which illustrates alternatives to searching for a key in a String array.
          Hide
          Al Thompson added a comment -

          I have reviewed the patch in the most recent attachment. In doing so, it occurred to me that the for loop to search for JMX options in the HADOOP_OPTS environment variable could be simplified. This can be done by using either List.contains(..) or Arrays.binarySearch(..). The former may be preferred, as the latter requires the collection to be sorted before invoking binarySearch(). I will attach a small sample that illustrates both of these.

          I don't think changing this implementation will impact performance much, good or bad. IMHO the code change would result in cleaner code and provide an examplar to other committers on how not to re-implement the Java libraries in Hadoop code.

          Show
          Al Thompson added a comment - I have reviewed the patch in the most recent attachment. In doing so, it occurred to me that the for loop to search for JMX options in the HADOOP_OPTS environment variable could be simplified. This can be done by using either List.contains(..) or Arrays.binarySearch(..). The former may be preferred, as the latter requires the collection to be sorted before invoking binarySearch(). I will attach a small sample that illustrates both of these. I don't think changing this implementation will impact performance much, good or bad. IMHO the code change would result in cleaner code and provide an examplar to other committers on how not to re-implement the Java libraries in Hadoop code.
          Konstantin Boudnik made changes -
          Attachment HADOOP-6977.y20S.patch [ 12455880 ]
          Attachment HADOOP-6977.patch [ 12455881 ]
          Hide
          Konstantin Boudnik added a comment -

          Addressing Sharad's comments.

          Show
          Konstantin Boudnik added a comment - Addressing Sharad's comments.
          Hide
          Sharad Agarwal added a comment -

          Comments on https://issues.apache.org/jira/secure/attachment/12455784/HADOOP-6977.patch

          • jmxAttributes cache is redundant. put and get are called in same call.
          • Some methods getJmxBeanName and establishJmxConnection should throw exception instead of suppressing them.
          Show
          Sharad Agarwal added a comment - Comments on https://issues.apache.org/jira/secure/attachment/12455784/HADOOP-6977.patch jmxAttributes cache is redundant. put and get are called in same call. Some methods getJmxBeanName and establishJmxConnection should throw exception instead of suppressing them.
          Konstantin Boudnik made changes -
          Link This issue blocks MAPREDUCE-2093 [ MAPREDUCE-2093 ]
          Konstantin Boudnik made changes -
          Link This issue blocks HDFS-1408 [ HDFS-1408 ]
          Hide
          Philip Zeyliger added a comment -

          Konstantin,

          Sounds, well, sound.

          Show
          Philip Zeyliger added a comment - Konstantin, Sounds, well, sound.
          Hide
          Konstantin Boudnik added a comment -

          Separating out the presentation and the data in dfshealth.jsp would be a boon all around

          The idea of this one is slightly different: instead of making *.jsp better modeled I'm going to provide a way for a system test to reach JMX beans and pull down any information available there. I agree that built-in testability is a better answer for many problems, however this fix is about making already available information be visible in the cluster testing.

          Show
          Konstantin Boudnik added a comment - Separating out the presentation and the data in dfshealth.jsp would be a boon all around The idea of this one is slightly different: instead of making *.jsp better modeled I'm going to provide a way for a system test to reach JMX beans and pull down any information available there. I agree that built-in testability is a better answer for many problems, however this fix is about making already available information be visible in the cluster testing.
          Konstantin Boudnik made changes -
          Attachment HADOOP-6977.y20S.patch [ 12455786 ]
          Hide
          Konstantin Boudnik added a comment -

          Moving the patch over from HDFS-1408: common part belong to Hadoop component. Also, this is the patch for y20-security.

          Show
          Konstantin Boudnik added a comment - Moving the patch over from HDFS-1408 : common part belong to Hadoop component. Also, this is the patch for y20-security.
          Konstantin Boudnik made changes -
          Attachment HADOOP-6977.patch [ 12455784 ]
          Hide
          Konstantin Boudnik added a comment -

          Common part of large HDFS-1408 which has been actually made for y20-security.

          Show
          Konstantin Boudnik added a comment - Common part of large HDFS-1408 which has been actually made for y20-security.
          Hide
          Philip Zeyliger added a comment -

          Separating out the presentation and the data in dfshealth.jsp would be a boon all around. It would make testing easier, and it would open the way for supporting other (XML, JSON, whatever) ways to get at those things. +1 for the idea.

          Show
          Philip Zeyliger added a comment - Separating out the presentation and the data in dfshealth.jsp would be a boon all around. It would make testing easier, and it would open the way for supporting other (XML, JSON, whatever) ways to get at those things. +1 for the idea.
          Konstantin Boudnik made changes -
          Summary CLONE -Herriot NN and DN clients should vend statistics Herriot daemon clients should vend statistics
          Description The HDFS web user interface serves useful information through dfshealth.jsp and dfsnodelist.jsp.

          The Herriot interface to the namenode and datanode (as implemented in NNClient and DNClient, respectively) would benefit from the addition of some way to channel this information. In the case of DNClient this can be an injected method that returns a DatanodeDescriptor relevant to the underlying datanode.

          There seems to be no analagous NamenodeDescriptor. It may be useful to add this as a facade to a visitor that aggregates values across the filesystem datanodes. These values are (from dfshealth JSP):

          Configured Capacity
          DFS Used
          Non DFS Used
          DFS Remaining
          DFS Used%
          DFS Remaining%
          Live Nodes
          Dead Nodes
          Decommissioning Nodes
          Number of Under-Replicated Blocks

          Attributes reflecting the web user interface header may also be useful such as When-Started, Version, When-Compiled, and Upgrade-Status.

          A NamenodeDescriptor would essentially "push down" the code in dfshealth web UI behind a more general abstraction. If it is objectionable to make this class available in HDFS, perhaps this could be packaged in a Herriot specific way.
          The HDFS web user interface serves useful information through dfshealth.jsp and dfsnodelist.jsp.

          The Herriot interface to Hadoop cluster daemons would benefit from the addition of some way to channel metics information.
          Konstantin Boudnik made changes -
          Project Hadoop HDFS [ 12310942 ] Hadoop Common [ 12310240 ]
          Key HDFS-1423 HADOOP-6977
          Affects Version/s 0.22.0 [ 12314296 ]
          Affects Version/s 0.22.0 [ 12314241 ]
          Component/s test [ 12311440 ]
          Component/s test [ 12312916 ]
          Konstantin Boudnik made changes -
          Field Original Value New Value
          Link This issue is a clone of HDFS-1408 [ HDFS-1408 ]
          Konstantin Boudnik created issue -

            People

            • Assignee:
              Konstantin Boudnik
              Reporter:
              Konstantin Boudnik
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development