Hadoop Common
  HADOOP-4675

Current Ganglia metrics implementation is incompatible with Ganglia 3.1

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.22.0
    • Component/s: metrics
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Support for reporting metrics to Ganglia 3.1 servers
    • Tags:
      ganglia metrics

      Description

      Ganglia changed its wire protocol in the 3.1.x series; the current implementation only works for 3.0.x.

      1. hadoop-4675.patch
        14 kB
        Brian Bockelman
      2. hadoop-4675-2.patch
        14 kB
        Brian Bockelman
      3. hadoop-4675-3.patch
        10 kB
        zhangwei
      4. HADOOP-4675-4.patch
        9 kB
        zhangwei
      5. HADOOP-4675-v5.patch
        14 kB
        Brian Bockelman
      6. HADOOP-4675-v6.patch
        20 kB
        Brian Bockelman
      7. HADOOP-4675-v7.patch
        16 kB
        Brian Bockelman
      8. HADOOP-4675-v8.patch
        15 kB
        Scott Beardsley
      9. HADOOP-4675-v9.patch
        15 kB
        Matt Massie
      10. HADOOP-4675.patch
        9 kB
        Tom White

          Activity

          Brian Bockelman added a comment -

          Looking at the new Ganglia wire protocol, we'll probably need to do the following:
          1) Implement more XDR serializations. 3.0.x protocol sends all data as strings; 3.1.x correctly sends float data as XDR floats.
          2) Test and verify sending Longs as floats in Ganglia actually works. This has been discussed, but I'm not sure it's entirely functional... maybe get a test case going?
          3) Probably this will result in a new GangliaContext31 class, as I don't see any clean way to merge the two protocols together into one class.

          • The patch for HADOOP-3422 caused a new metric name to be generated in the emitRecord class; we could reuse this if it was separated out into a new function.

          Brian
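To make point 1) concrete, here is a minimal sketch (my own illustration, not the Hadoop implementation) of the extra XDR serializations the 3.1 protocol needs: strings are length-prefixed and zero-padded to a 4-byte boundary, and floats go out as raw IEEE-754 big-endian words instead of being stringified as in the 3.0 protocol.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class XdrSketch {
    // ByteBuffer is big-endian by default, matching XDR's network byte order.
    private final ByteBuffer buf = ByteBuffer.allocate(1500);

    void xdrInt(int i) {                 // 4-byte big-endian integer
        buf.putInt(i);
    }

    void xdrFloat(float f) {             // IEEE-754 float, network order
        buf.putFloat(f);
    }

    void xdrString(String s) {           // length prefix, then zero-padded bytes
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        buf.putInt(bytes.length);
        buf.put(bytes);
        int pad = (4 - bytes.length % 4) % 4;
        for (int i = 0; i < pad; i++) buf.put((byte) 0);
    }

    int length() { return buf.position(); }

    public static void main(String[] args) {
        XdrSketch x = new XdrSketch();
        x.xdrString("dfs.namenode.FilesCreated"); // 4 + 25 + 3 padding = 32 bytes
        x.xdrFloat(1.0f);                         // 4 more bytes
        System.out.println(x.length());           // prints 36
    }
}
```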

          Brian Bockelman added a comment -

          The attached patch (which includes the patch for HADOOP-3422 in order to make things work...) should provide Ganglia 3.1 support.

          Note that this patch probably will be conflicting in Hudson until HADOOP-3422 lands in trunk. I'll rebase things then.

          Brian Bockelman added a comment -

          Fix a build issue.

          zhangwei added a comment -

          I tested the patch and it works OK. I suggest adding the group attribute in the extra data: metrics in Ganglia 3.1.* have this attribute, so I think the prefix of the metric name could be used as the group name.
          For example:
          metric: dfs.namenode.FilesCreated --> group: dfs.namenode

          zhangwei added a comment - edited

          hadoop-4675-3.patch: adds a group attribute which uses the prefix of the metric name.

          Brian Bockelman added a comment -

          Hey zhangwei,

          The group attribute stuff is a good improvement. Have you tested it on your cluster? I'll perhaps get to do this today or tomorrow; we're going to be upgrading to Hadoop 0.19.0 soon, which will allow me to slip a few new patches in.

          Brian

          zhangwei added a comment -

          Hi, Brian

          I tested it on my cluster, which has Ganglia 3.1.1 installed. I added a little code on top of your patch:

          + String groupName = name.substring(0, name.lastIndexOf("."));
          . . . . . .
          xdr_int(dmax); // dmax, the maximum data value
          //xdr_string(""); // Empty extra_value field for Ganglia 3.1.
          + xdr_int(1); // the number of extra-data elements that follow
          + xdr_string("GROUP");
          + xdr_string(groupName);
          for (SocketAddress socketAddress : metricsServers) {
          DatagramPacket packet

          And it seems OK.
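A hypothetical helper (the name is mine, not from the patch) illustrating the group-name derivation above, with a guard for metric names that contain no dot; on those, name.lastIndexOf(".") returns -1 and substring(0, -1) would throw StringIndexOutOfBoundsException.

```java
public class GroupNameSketch {
    // Derive the Ganglia GROUP from a dotted metric name, e.g.
    // "dfs.namenode.FilesCreated" -> "dfs.namenode". Names without a
    // dot are used as their own group rather than throwing.
    static String groupName(String metricName) {
        int dot = metricName.lastIndexOf('.');
        return dot > 0 ? metricName.substring(0, dot) : metricName;
    }

    public static void main(String[] args) {
        System.out.println(groupName("dfs.namenode.FilesCreated")); // dfs.namenode
        System.out.println(groupName("uptime"));                    // uptime
    }
}
```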

          Brian Bockelman added a comment -

          I can confirm that the alterations to my patch do indeed appear to work.

          Submitting patch to Hudson.

          zhangwei added a comment -

          I followed the wiki "HowToContribute" step by step and created the new patch. I hope it works.

          zhangwei added a comment -

          The patch's content used the Chinese characters 工作副本, which mean "working copy".
          I translated them into English.

          steve_l added a comment -
          1. Does this break Ganglia 3.0?
          2. The code works out its hostname in a static initializer:

          +        InetAddress localMachine = InetAddress.getLocalHost();
          +        hostName = localMachine.getHostName();

          For 0.20 onwards, this patch should use the code in org.apache.hadoop.net.DNS rather than trying to do this itself. The DNS class has caching and fallback for machines that don't know who they are.
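A sketch of the pattern being pointed at here: resolve the hostname once, cache it, and fall back to "localhost" when the machine cannot resolve itself. This is my own stdlib-only illustration; the real patch should call org.apache.hadoop.net.DNS, which adds per-interface and nameserver-aware lookup on top of this.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostNameSketch {
    // Cached after the first lookup so repeated metric emissions don't
    // hit the resolver every time.
    private static volatile String cachedHostName;

    static String hostName() {
        if (cachedHostName == null) {
            try {
                cachedHostName = InetAddress.getLocalHost().getHostName();
            } catch (UnknownHostException e) {
                // Fallback for machines that don't know who they are.
                cachedHostName = "localhost";
            }
        }
        return cachedHostName;
    }

    public static void main(String[] args) {
        System.out.println(hostName());
    }
}
```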

          Brian Bockelman added a comment -

          Hey Steve,

          1) This does not break Ganglia 3.0. It gives a subclass of the Ganglia 3.0 implementation that is compatible with 3.1.

          2) I'll have to work on this - we're currently using the patch for 0.19, so I haven't looked at the DNS class yet. Can you give an example (or fix the patch yourself)?

          Brian Bockelman added a comment -

          I believe this works - Steve, can you look over my usage of the DNS class?

          Brian Bockelman added a comment -

          Ok - I think the patch is ready - why is there no "Submit Patch" from JIRA? Did I do something wrong?

          Brian Bockelman added a comment -

          Hm - found a bug in the v5 patch. This next one uses the DNS class correctly, as Steve suggested.

          This next one also includes a test case for Ganglia - it's quite simple. All it does is configure the MiniDFSCluster to report to Ganglia31, and then start up a separate thread which listens for metrics. It verifies that the hostname is properly set inside the packet sent to Ganglia, but does not verify that all the formatting is correct.

          This appears to be the first test for Ganglia? It probably should have been written a while ago.
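The listener half of such a test can be sketched as follows. This is an assumed shape, not the actual TestGangliaContext31 code: bind a UDP socket on an ephemeral port, point the sender at it, and inspect the received packet for the expected hostname.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class UdpListenerSketch {
    // Bind a listener, send one packet at it, and return what arrived.
    static String receiveOne(byte[] payload) throws Exception {
        try (DatagramSocket listener = new DatagramSocket(0)) {  // ephemeral port
            listener.setSoTimeout(5000);  // don't hang the test forever

            // Stand-in for the metrics context emitting a packet.
            try (DatagramSocket sender = new DatagramSocket()) {
                sender.send(new DatagramPacket(payload, payload.length,
                        InetAddress.getLoopbackAddress(), listener.getLocalPort()));
            }

            byte[] buf = new byte[8192];
            DatagramPacket received = new DatagramPacket(buf, buf.length);
            listener.receive(received);
            return new String(received.getData(), 0, received.getLength(),
                    StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        String text = receiveOne("host=node1.example.com".getBytes(StandardCharsets.UTF_8));
        System.out.println(text.contains("node1.example.com"));
    }
}
```

A real test would parse the XDR fields out of the packet rather than string-matching, which is the "more validation" deferred below.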

          steve_l added a comment -

          Looks good to me, +1.

          One thing to consider in future: the code that listens for (and later validates) the Ganglia 3.1 messages should be moved somewhere it can be redistributed/reused in other tests and tools. That can wait until there is more validation.

          Brian Bockelman added a comment -

          Ok, submitting version 6 to Hudson. Includes a test case and correct DNS usage. Go Hudson, go!

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12404072/HADOOP-4675-v6.patch
          against trunk revision 763728.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/174/console

          This message is automatically generated.

          Brian Bockelman added a comment -

          v7 of the patch - I think I've fixed the 'patch' issue in Hudson - just a bum development environment on my laptop.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12407207/HADOOP-4675-v7.patch
          against trunk revision 772688.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 3 new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          -1 release audit. The applied patch generated 476 release audit warnings (more than the trunk's current 475 warnings).

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/296/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/296/artifact/trunk/current/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/296/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/296/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/296/console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12407207/HADOOP-4675-v7.patch
          against trunk revision 772844.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 3 new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/299/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/299/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/299/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/299/console

          This message is automatically generated.

          Johan Oskarsson added a comment -

          Could you have a look at these findbugs warnings and see if it is possible to fix them? http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/299/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Scott Beardsley added a comment -

          add support for Ganglia 3.1 metrics

          Scott Beardsley added a comment -

          The minor changes in this [1] patch should fix the findbugs warnings from previous versions. Tested against rev 805676 (current trunk).

          [1] https://issues.apache.org/jira/secure/attachment/12416987/HADOOP-4675-v8.patch

          --------
          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 3 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12416987/HADOOP-4675-v8.patch
          against trunk revision 804918.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/613/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/613/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/613/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/613/console

          This message is automatically generated.

          stack added a comment -

          I will commit this patch in next day or so unless objection.

          George Porter added a comment -

          Any word on whether this is in the trunk now? Thanks!

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #9 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/9/)
          Current Ganglia metrics implementation is incompatible with Ganglia 3.1

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #10 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/10/)
          Current Ganglia metrics implementation is incompatible with Ganglia 3.1

          stack added a comment -

          I applied the patch but then noticed that the unit test isn't being run; it's got the wrong package on it and so applies outside of src/test/core. It then fails to compile, because it uses MiniDFSCluster, when it is in the "correct" location under src/test/core. I reversed the patch until this is addressed.

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #11 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/11/)
          Current Ganglia metrics implementation is incompatible with Ganglia 3.1 – reversing this patch

          stack added a comment -

          If someone can fix the patch so the ganglia test is applied to src/test/core and not to src/test I will apply (Will have to do another spin through hudson....). This patch has been working well for me for months.

          Matt Massie added a comment -

          Trivial change to drop TestGangliaContext31 into src/test/core instead of src/test per stack's request.

          Scott Beardsley added a comment -

          Thanks Matt, you beat me to it.

          stack added a comment -

          Thanks lads.

          @Matt: Does it compile for you? I get this when I apply to current common TRUNK:

              [javac] Compiling 126 source files to /Users/stack/checkouts/common/trunk/build/test/core/classes
              [javac] /Users/stack/checkouts/common/trunk/src/test/core/org/apache/hadoop/metrics/TestGangliaContext31.java:31: package org.apache.hadoop.hdfs does not exist
              [javac] import org.apache.hadoop.hdfs.MiniDFSCluster;
              [javac]                              ^
          

          Here's my version:

          URL: https://svn.apache.org/repos/asf/hadoop/common/trunk
          Repository Root: https://svn.apache.org/repos/asf
          Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
          Revision: 811214
          ...
          

          Common doesn't bundle HDFS. Either I'm doing something wrong, or the test no longer fits after the refactoring of Hadoop into separate modules.

          If no MiniDFSCluster is available in Common, then the test doesn't fit the new constrained environment (though the test is creative with its setup of a listener to catch and verify the Ganglia emissions). I'd be fine with removing the test. What do you all think?

          Matt Massie added a comment -

          @stack: the patch compiles if you run, e.g...

          CLASSPATH="../hadoop-hdfs/build/hadoop-hdfs-0.21.0-dev.jar:../hadoop-hdfs/build/hadoop-hdfs-test-0.21.0-dev.jar" ant bin-package
          

          You're correct that MiniDFSCluster is not a part of hadoop-common but rather hadoop-hdfs.

          In the future, I will run something like the following to test patches since I think it closely mimics Hudson, e.g.

          ant test-patch -Dpatch.file=HADOOP-4675-v9.patch  -Dfindbugs.home=... -Djava5.home=... -Dforrest.home=...
          

          where none of the non-hadoop-common jars are in the classpath.

          I read through your test and agree: it is creative. It's a shame that MiniDFSCluster isn't available in common. Maybe the test should be commented out until Mini clusters are available for common tests? It's not ideal but having this patch linger on without resolution isn't ideal either. The code in the patch shouldn't be prone to bugs since it's simply sending specially crafted UDP messages. I trust that Java sockets have been thoroughly tested and the xdr_ functions are very straight-forward (e.g. xdr_int is basically htonl()).

          Btw, your approach to sending the ganglia meta data each time is practical. It can be optimized in the future but that might add unwelcome complexity.
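As a rough illustration of the xdr_ functions discussed above: XDR encodes integers as 4-byte big-endian values (the same bytes htonl() produces) and strings as length-prefixed bytes padded to a 4-byte boundary. The sketch below is illustrative only; the method names (xdrInt, xdrString) are not the exact ones in GangliaContext31.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Minimal sketch of the XDR primitives the patch relies on.
public class XdrSketch {
    // xdr_int: a 4-byte big-endian integer, i.e. the same bytes htonl() yields.
    static void xdrInt(DataOutputStream out, int v) throws IOException {
        out.writeInt(v); // DataOutputStream already writes big-endian
    }

    // xdr_string: length prefix, then the bytes, padded to a 4-byte boundary.
    static void xdrString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes("UTF-8");
        xdrInt(out, bytes.length);
        out.write(bytes);
        int pad = (4 - (bytes.length % 4)) % 4;
        for (int i = 0; i < pad; i++) out.writeByte(0);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        xdrInt(out, 1);        // 4 bytes: 00 00 00 01
        xdrString(out, "abc"); // 4 (length) + 3 (data) + 1 (pad) = 8 bytes
        System.out.println(buf.toByteArray().length); // 12
    }
}
```

Encoding floats (the other change in the 3.1.x protocol) follows the same pattern with writeFloat, which emits the IEEE 754 bits big-endian.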

          Tom White added a comment -

          There are a couple of options I can see.

          1. Convert the test to be more of a unit test of GangliaContext31. So rather than use MiniDFSCluster, inject metrics directly into the context and check that they are correctly received by the socket listener.
          2. Move the test to the HDFS project as an integration test.

          Ideally we'd do both, since 1 is a unit test for Ganglia, and 2 tests the integration of metrics and HDFS.
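Option 1 above could be sketched as follows: open a local UDP listener, emit a packet at it, and assert on the received bytes. The sender here is a plain DatagramSocket standing in for the Ganglia context; in the real test, a GangliaContext31 would be configured to point at localhost:&lt;port&gt; and metrics injected directly. Class and method names are illustrative, not from the patch.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

// Sketch of a MiniDFSCluster-free unit test: a loopback UDP round trip.
public class GangliaListenerSketch {

    // Send 'payload' to a local listener and return exactly what arrived.
    static byte[] roundTrip(byte[] payload) throws Exception {
        try (DatagramSocket listener = new DatagramSocket(0); // any free port
             DatagramSocket sender = new DatagramSocket()) {
            listener.setSoTimeout(2000); // fail fast rather than hang the test
            sender.send(new DatagramPacket(payload, payload.length,
                    InetAddress.getLoopbackAddress(), listener.getLocalPort()));
            byte[] buf = new byte[8192];
            DatagramPacket received = new DatagramPacket(buf, buf.length);
            listener.receive(received);
            byte[] out = new byte[received.getLength()];
            System.arraycopy(received.getData(), 0, out, 0, out.length);
            return out;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] got = roundTrip("fake-ganglia-packet".getBytes("UTF-8"));
        System.out.println(new String(got, "UTF-8")); // fake-ganglia-packet
    }
}
```

A real assertion would then decode the received packet with the XDR rules and compare metric names and values, rather than comparing raw strings.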

          Brian Bockelman added a comment -

          Hey Tom,

          I'm a bit confused on what to do next. Are you suggesting that we mark this as WONT FIX and re-open for HDFS? It's been about 9 months since I submitted the patch, so I'm mentally a bit behind.

          Brian

          Tom White added a comment -

          Are you suggesting that we mark this as WONT FIX and re-open for HDFS?

          No, I think writing a unit test for GangliaContext31 (that doesn't use MiniDFSCluster) is needed to commit this. At a later point we could then follow up with a separate HDFS issue that has the test using MiniDFSCluster.

          ryan rawson added a comment -

          Hey, can we get this into 0.20.x? This is a super benign patch and people are hurting without it.

          Thomas Koch added a comment -

          It would be awesome if somebody could provide a patch for 0.20; then the upcoming Debian Squeeze would get this fix and work with the version of Ganglia in Debian Squeeze. I'm not familiar enough with either project to work on this myself.

          Tom White added a comment -

          I've regenerated the patch without the test, to be added in HDFS-1493. If there are no objections to the approach I'd like to commit this, since the fix is long overdue.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12459227/HADOOP-4675.patch
          against trunk revision 1032730.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/86//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/86//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/86//console

          This message is automatically generated.

          Tom White added a comment -

          I've just committed this. Thanks Brian and everyone who helped out!

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #420 (See https://hudson.apache.org/hudson/job/Hadoop-Common-trunk-Commit/420/)

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #509 (See https://hudson.apache.org/hudson/job/Hadoop-Common-trunk/509/)
          HADOOP-4675. Current Ganglia metrics implementation is incompatible with Ganglia 3.1. Contributed by Brian Bockelman.


            People

            • Assignee:
              Brian Bockelman
              Reporter:
              Brian Bockelman
            • Votes:
              6
              Watchers:
              17
