Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: bsp core
    • Labels:
      None

      Description

      HAMA-363 provides a way to access cluster status. Additional features, such as federation, are required so that the system can coordinate better and the master has a view of the whole system.

      1. HAMA-495.patch
        41 kB
        ChiaHung Lin

        Activity

        ChiaHung Lin added a comment - - edited

        The attached patch tries to provide functions similar to Ganglia's gmond. Users can create a plugin (as a task) to be executed by the Monitor service, in which the Collector executes tasks (e.g. harvesting metrics) and the Publisher pushes data to a place accessible by, e.g., the master server.

        The patch still needs to be improved, e.g. by providing federation.

        A small test was done on a few VMs, but more testing is needed.
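
The task / Collector / Publisher split described above could look roughly like the following. This is only a minimal sketch: `MetricTask`, `Publisher`, `JvmHeapTask`, `Collector` and `MonitorSketch` are hypothetical names invented for illustration and are not the classes from HAMA-495.patch. The only real API used is the JDK's `java.lang.management` package.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical plugin contract: one task harvests one group of metrics. */
interface MetricTask {
  String name();
  Map<String, Long> harvest();
}

/** Hypothetical sink; a real publisher might push to a place the master can read. */
interface Publisher {
  void publish(String taskName, Map<String, Long> metrics);
}

/** Example task (assumed, not from the patch): reads heap usage of the current JVM. */
final class JvmHeapTask implements MetricTask {
  @Override public String name() { return "jvm"; }

  @Override public Map<String, Long> harvest() {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    Map<String, Long> metrics = new LinkedHashMap<>();
    metrics.put("heapUsedBytes", heap.getUsed());
    metrics.put("heapCommittedBytes", heap.getCommitted());
    return metrics;
  }
}

/** Runs every registered task once and forwards each result to the publisher. */
final class Collector {
  private final List<MetricTask> tasks = new ArrayList<>();
  private final Publisher publisher;

  Collector(Publisher publisher) { this.publisher = publisher; }

  void register(MetricTask task) { tasks.add(task); }

  void collectOnce() {
    for (MetricTask task : tasks) {
      publisher.publish(task.name(), task.harvest());
    }
  }
}

final class MonitorSketch {
  public static void main(String[] args) {
    // A console publisher stands in for whatever store the master would read.
    Collector collector = new Collector(
        (name, metrics) -> System.out.println(name + " -> " + metrics));
    collector.register(new JvmHeapTask());
    collector.collectOnce();
  }
}
```

Keeping the publisher behind an interface means the collection loop does not need to know where the data ends up.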

        Edward J. Yoon added a comment -

        Could you specify Fix Version/s for this issue?

        Edward J. Yoon added a comment -

        Patch looks good. I suggest committing it first, before other changes are made.

        ChiaHung Lin added a comment -

        The patch is committed.

        Thomas Jungblut added a comment -

        The build fails within TestZKUtil.

        junit.framework.AssertionFailedError: Make sure token are 4. expected:<1> but was:<4>
        	at junit.framework.Assert.fail(Assert.java:47)
        	at junit.framework.Assert.failNotEquals(Assert.java:283)
        	at junit.framework.Assert.assertEquals(Assert.java:64)
        	at junit.framework.Assert.assertEquals(Assert.java:195)
        	at org.apache.hama.util.TestZKUtil.setUp(TestZKUtil.java:69)
        	at junit.framework.TestCase.runBare(TestCase.java:132)
        	at junit.framework.TestResult$1.protect(TestResult.java:110)
        	at junit.framework.TestResult.runProtected(TestResult.java:128)
        	at junit.framework.TestResult.run(TestResult.java:113)
        	at junit.framework.TestCase.run(TestCase.java:124)
        	at junit.framework.TestSuite.runTest(TestSuite.java:232)
        	at junit.framework.TestSuite.run(TestSuite.java:227)
        	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
        	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
        	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
        
        
        Hudson added a comment -

        Integrated in Hama-Nightly #466 (See https://builds.apache.org/job/Hama-Nightly/466/)
        HAMA-495 provides features similar to ganglia's gmond so that users can monitor
        service. (Revision 1293544)

        Result = SUCCESS
        chl501 :
        Files :

        • /incubator/hama/trunk/contrib
        • /incubator/hama/trunk/contrib/monitor-plugin
        • /incubator/hama/trunk/contrib/monitor-plugin/jvm-metrics
        • /incubator/hama/trunk/contrib/monitor-plugin/jvm-metrics/pom.xml
        • /incubator/hama/trunk/contrib/monitor-plugin/jvm-metrics/src
        • /incubator/hama/trunk/contrib/monitor-plugin/jvm-metrics/src/main
        • /incubator/hama/trunk/contrib/monitor-plugin/jvm-metrics/src/main/java
        • /incubator/hama/trunk/contrib/monitor-plugin/jvm-metrics/src/main/java/org
        • /incubator/hama/trunk/contrib/monitor-plugin/jvm-metrics/src/main/java/org/apache
        • /incubator/hama/trunk/contrib/monitor-plugin/jvm-metrics/src/main/java/org/apache/hama
        • /incubator/hama/trunk/contrib/monitor-plugin/jvm-metrics/src/main/java/org/apache/hama/monitor
        • /incubator/hama/trunk/contrib/monitor-plugin/jvm-metrics/src/main/java/org/apache/hama/monitor/plugin
        • /incubator/hama/trunk/contrib/monitor-plugin/jvm-metrics/src/main/java/org/apache/hama/monitor/plugin/JvmTask.java
        • /incubator/hama/trunk/core/src/main/java/org/apache/hama/bsp/GroomServer.java
        • /incubator/hama/trunk/core/src/main/java/org/apache/hama/monitor
        • /incubator/hama/trunk/core/src/main/java/org/apache/hama/monitor/Configurator.java
        • /incubator/hama/trunk/core/src/main/java/org/apache/hama/monitor/Metric.java
        • /incubator/hama/trunk/core/src/main/java/org/apache/hama/monitor/MetricsRecord.java
        • /incubator/hama/trunk/core/src/main/java/org/apache/hama/monitor/MetricsTag.java
        • /incubator/hama/trunk/core/src/main/java/org/apache/hama/monitor/Monitor.java
        • /incubator/hama/trunk/core/src/main/java/org/apache/hama/monitor/MonitorListener.java
        • /incubator/hama/trunk/core/src/main/java/org/apache/hama/util/ZKUtil.java
        • /incubator/hama/trunk/core/src/test/java/org/apache/hama/util/TestZKUtil.java
        ChiaHung Lin added a comment -

        Does the build still fail? From the console output [1], it looks like the patch passes the Hudson build.

        Checking the TestZKUtil file, it fails when counting tokens:

            this.path = "/monitor/groom_lab01_61000/metrics/jvm";
            StringTokenizer token = new StringTokenizer(path, File.separator);
            int count = token.countTokens(); // should be 4
            assertEquals("Make sure token are 4.", count, 4);
        

        But that does not seem to be the case, because the path is fixed and is not modified. Do we have more output or debug info, e.g. in surefire-reports/org.apache.hama.util.TestZKUtil.txt? Maybe I am missing something.

        [1]. https://builds.apache.org/job/Hama-Nightly/466/console

        Thomas Jungblut added a comment -

        Ah, that is a Windows problem. I guess we have to change File.separator to "/".

        Thomas Jungblut added a comment -

        I have added ZKUtil.ZK_SEPARATOR, so it should work now.

        -> Resolved.

        ChiaHung Lin added a comment -

        If that's a Windows problem (i.e. slash vs. backslash), shouldn't we use File.separator instead of a hard-coded slash string such as ZK_SEPARATOR?

        The original problem seems to stem from the hard-coded path "/monitor/groom_lab01_61000/metrics/jvm". When running on Windows, the StringTokenizer's delimiter becomes a backslash, hence the count is 1.
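
The effect described here can be reproduced on any platform by passing the delimiter explicitly. The following is a minimal sketch; `SeparatorDemo` and `countSegments` are illustrative names and not part of the Hama code base.

```java
import java.io.File;
import java.util.StringTokenizer;

final class SeparatorDemo {
  /** Counts the segments of a path when it is split on the given delimiter. */
  static int countSegments(String path, String delimiter) {
    return new StringTokenizer(path, delimiter).countTokens();
  }

  public static void main(String[] args) {
    String znodePath = "/monitor/groom_lab01_61000/metrics/jvm";

    // Splitting on the slash that the path actually contains.
    System.out.println(countSegments(znodePath, "/"));   // 4

    // Splitting on a backslash, which is what File.separator is on Windows:
    // the path contains none, so the whole string is a single token.
    System.out.println(countSegments(znodePath, "\\"));  // 1

    // Result depends on the platform the code runs on.
    System.out.println(countSegments(znodePath, File.separator));
  }
}
```

Note also that JUnit's `assertEquals(String message, expected, actual)` takes the expected value before the actual one. The quoted test passes `count` first, which is why the failure reads `expected:<1> but was:<4>` even though the tokenizer returned 1.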

        Thomas Jungblut added a comment -

        I think that ZK handles its paths on disk separately from the names of the znode paths. I haven't faced any issues yet with normal slashes. However, the backslashes make the test case fail.

        > The original problem seems to stem from the hard-coded path "/monitor/groom_lab01_61000/metrics/jvm". When running on Windows, the StringTokenizer's delimiter becomes a backslash, hence the count is 1.

        Yes. We can change the "/" in ZKUtil to File.separator. But then we should retrieve the ZK paths through the utility that uses the separator. Otherwise we end up with n separators and y paths.
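
One way to keep every znode path going through a single separator is a small helper like the one below. This is a sketch under assumptions: the constant name mirrors the `ZKUtil.ZK_SEPARATOR` mentioned earlier in this thread, but the class `ZnodePaths` and its `join` and `split` methods are hypothetical and do not describe the actual ZKUtil API.

```java
import java.util.ArrayList;
import java.util.List;

final class ZnodePaths {
  /** Znode paths use "/" regardless of the host operating system. */
  static final String ZK_SEPARATOR = "/";

  private ZnodePaths() {}

  /** Builds an absolute znode path from its segments. */
  static String join(String... segments) {
    if (segments.length == 0) {
      return ZK_SEPARATOR; // the root znode
    }
    StringBuilder sb = new StringBuilder();
    for (String segment : segments) {
      sb.append(ZK_SEPARATOR).append(segment);
    }
    return sb.toString();
  }

  /** Splits a znode path into its non-empty segments. */
  static List<String> split(String path) {
    List<String> segments = new ArrayList<>();
    for (String segment : path.split(ZK_SEPARATOR)) {
      if (!segment.isEmpty()) {
        segments.add(segment);
      }
    }
    return segments;
  }
}
```

With such a helper, both the code that creates znodes and the tests that inspect them derive paths from the same constant, so the result does not depend on `File.separator`.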

        ChiaHung Lin added a comment -

        You are right. ZK handles paths in Unix style.
        Thanks for solving the problem.

        Edward J. Yoon added a comment -
        12/02/27 13:59:50 INFO monitor.Monitor: How many workers will be executed by collector? 0
        

        Should we change the log level to debug for this message?

        ChiaHung Lin added a comment -

        The log level has been changed, so output messages are reduced.

        Edward J. Yoon added a comment -
        
        slave.udanax.org: starting groom, logging to /home/edward/workspace/hama-trunk/bin/../logs/hama-edward-groom-slave.out
        tweetple.com: Exception in thread "Thread-25" java.lang.NullPointerException
        tweetple.com: 	at org.apache.hama.monitor.Configurator.configure(Configurator.java:62)
        tweetple.com: 	at org.apache.hama.monitor.Monitor$Initializer.run(Monitor.java:350)
        slave.udanax.org: Exception in thread "Thread-25" java.lang.NullPointerException
        slave.udanax.org: 	at org.apache.hama.monitor.Configurator.configure(Configurator.java:62)
        slave.udanax.org: 	at org.apache.hama.monitor.Monitor$Initializer.run(Monitor.java:350)
        edward@slave:~/workspace/hama-trunk$ bin/stop-bspd.sh 
        stopping bspmaster
        tweetple.com: stopping groom
        slave.udanax.org: stopping groom
        slave.udanax.org: stopping zookeeper
        edward@slave:~/workspace/hama-trunk$ bin/start-bspd.sh 
        slave.udanax.org: starting zookeeper, logging to /home/edward/workspace/hama-trunk/bin/../logs/hama-edward-zookeeper-slave.out
        starting bspmaster, logging to /home/edward/workspace/hama-trunk/bin/../logs/hama-edward-bspmaster-slave.out
        tweetple.com: starting groom, logging to /home/edward/workspace/hama-trunk/bin/../logs/hama-edward-groom-tweetple.com.out
        slave.udanax.org: starting groom, logging to /home/edward/workspace/hama-trunk/bin/../logs/hama-edward-groom-slave.out
        edward@slave:~/workspace/hama-trunk$ 
        

        Sometimes, it throws NullPointerExceptions.

        ChiaHung Lin added a comment -

        Is bsp.monitor.plugins.dir configured to point to a jar file? For example,

            <property>
              <name>bsp.monitor.plugins.dir</name>
              <value>/home/UserA/hama-trunk/plugins/jvm-plugin-0.1-SNAPSHOT.jar</value>
            </property>
        

        If that is the case, the property should be removed, or configured to point to a directory, e.g. plugins:

            <property>
              <name>bsp.monitor.plugins.dir</name>
              <value>/home/UserA/hama-trunk/plugins</value>
            </property>
        

        The service scans the plugins directory (File.listFiles()), looking for jar files, and loads them to execute the tasks. Because File.listFiles() returns null when the path is not a directory, pointing the property to a file will throw an NPE.
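
A null check around `File.listFiles()` is one way to avoid the exception. The sketch below is an assumption about how such a guard might look: `PluginScanner` and `findJars` are hypothetical names, and the actual code in Configurator.java is not shown in this thread.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

final class PluginScanner {
  private PluginScanner() {}

  /**
   * Returns the jar files directly inside pluginsDir, or an empty list when
   * pluginsDir is missing, is a regular file, or cannot be read.
   */
  static List<File> findJars(File pluginsDir) {
    List<File> jars = new ArrayList<>();
    // listFiles() returns null if the path is not a directory or an I/O error occurs.
    File[] entries = pluginsDir.listFiles();
    if (entries == null) {
      return jars;
    }
    for (File entry : entries) {
      if (entry.isFile() && entry.getName().endsWith(".jar")) {
        jars.add(entry);
      }
    }
    return jars;
  }
}
```

Returning an empty list lets the caller start without plugins and log a warning, instead of failing in the initializer thread.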

        Edward J. Yoon added a comment -

        Hi Chiahung,

        Should we add this plugin as a default in the 0.5 release, or handle the NullPointerExceptions?

        Thomas Jungblut added a comment -

        Why not both?

        Edward J. Yoon added a comment -

        +1 xD


          People

          • Assignee:
            ChiaHung Lin
            Reporter:
            ChiaHung Lin
          • Votes:
            0
            Watchers:
            1
