Hadoop Common / HADOOP-132

An API for reporting performance metrics

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.2.0
    • Component/s: None
    • Labels:
      None

      Description

      I'd like to propose adding an API for reporting performance metrics. I will post some javadoc as soon as I figure out how to do so. The idea is for the API to be sufficiently abstract that various different implementations can be plugged in. In particular, there would be one that just writes the metric data to a file, and another that sends metrics to Ganglia. It would also be possible to plug in an implementation that can support high-frequency (say, per-second) sending of fairly large amounts of data (up to hundreds of metrics) across the network.

      I'd be very interested in people's thoughts about what the requirements should be for such an API.
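The pluggable design described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `MetricsSink`, `LineSink`, `DiscardSink`, and `reportMapProgress` are hypothetical names invented for this example, not the API in the attached javadoc. `LineSink` stands in for a backend that writes metric data to a file; a Ganglia backend would implement the same interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not the API from the attached javadoc.
interface MetricsSink {
    void emit(String recordName, String metricName, long value);
}

/** Backend that formats each metric as a line of text (stand-in for a file writer). */
class LineSink implements MetricsSink {
    final List<String> lines = new ArrayList<String>();

    public void emit(String recordName, String metricName, long value) {
        lines.add(recordName + ": " + metricName + "=" + value);
    }
}

/** Backend that silently discards every metric. */
class DiscardSink implements MetricsSink {
    public void emit(String recordName, String metricName, long value) {
        // Intentionally empty.
    }
}

public class PluggableMetricsSketch {
    /** Application code depends only on the interface, never on a backend. */
    static void reportMapProgress(MetricsSink sink, long recordsRead) {
        sink.emit("mapTask", "recordsRead", recordsRead);
    }

    public static void main(String[] args) {
        LineSink sink = new LineSink();
        reportMapProgress(sink, 42);
        System.out.println(sink.lines);

        // Swapping the backend requires no change to the reporting code.
        reportMapProgress(new DiscardSink(), 42);
    }
}
```

Because the reporting code only sees the interface, choosing between backends becomes a configuration decision rather than a code change.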

      • David Bowen
      1. metrics.patch
        77 kB
        David Bowen
      2. javadoc.tgz
        24 kB
        David Bowen
      3. javadoc.tgz
        37 kB
        David Bowen

        Activity

        David Bowen added a comment -

        Here is the proposed javadoc. Comments on the API are welcome also.

        stack added a comment -

        Looks good. Would be sweet having job progress show in Ganglia. Metrics would run on every slave?

        I'd suggest that MetricsRecord and gauges need descriptions (adding one to a record would be easy enough; harder for a gauge, going by your API so far). They would come in handy in an admin page listing available records and their gauges. If they had descriptions, then a record could be roughly mapped to a JMX MBean and a gauge to a JMX attribute.

        Doug Cutting added a comment -

        A nit: the code should be in package org.apache.hadoop.metrics, not org.hadoop.metrics.

        David Bowen added a comment -

        Here is an updated API, incorporating the feedback I've received so far. The main changes are:

        (1) Ganglia support is included - this doesn't really affect the API, but the javadoc describes the configuration options which might be of interest.

        (2) The SPI (service provider interface) is now in a separate package, and it does most of the implementation work so that implementation packages (like the file and ganglia sub-packages) can be quite small.

        (3) The unbuffered option has been removed. This means that developers can use the API freely in inner loops without worrying that metric reporting might be configured to emit data on every update. The data will just be stored in an internal table, and the table will only be sent periodically to the metrics server.
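The buffering behaviour in point (3) might look something like the sketch below. It is not the code from the patch: the class name `BufferedMetricsSketch` and its `flush()` and `getFlushCount()` methods are hypothetical, the timer that would call `flush()` once per period is left out, and keeping the table contents after a flush is an assumption, since the comment does not say whether the table is cleared.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; names and flush semantics are assumptions, not the patch's code.
public class BufferedMetricsSketch {
    private final Map<String, Long> table = new LinkedHashMap<String, Long>();
    private int flushCount = 0;

    /** Cheap enough for an inner loop: it only touches the in-memory table. */
    public synchronized void incrMetric(String name, long delta) {
        Long old = table.get(name);
        table.put(name, (old == null ? 0L : old) + delta);
    }

    /**
     * Would be called once per reporting period, for example by a timer.
     * Returns a snapshot of what would be sent to the metrics server.
     */
    public synchronized Map<String, Long> flush() {
        flushCount++;
        return new LinkedHashMap<String, Long>(table);
    }

    public synchronized int getFlushCount() {
        return flushCount;
    }

    public static void main(String[] args) {
        BufferedMetricsSketch metrics = new BufferedMetricsSketch();
        for (int i = 0; i < 1000; i++) {
            metrics.incrMetric("bytesRead", 64); // no I/O happens here
        }
        Map<String, Long> snapshot = metrics.flush();
        System.out.println(snapshot + ", flushes so far: " + metrics.getFlushCount());
    }
}
```

The point of the design is that a thousand updates cost a thousand map writes and a single transmission, so instrumenting a hot loop does not change how much data goes over the network.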

        Doug Cutting added a comment -

        +1 Looks good to me!

        I think there's a typo in the overview, where you should have setMetric() you instead have setGauge().

        David Bowen added a comment -

        OK, here is the source code, in the form of a patch.

        NOTE: this requires javac.version 1.5. I.e. you currently have to say "ant -Djavac.version=1.5". Is this acceptable in general for Hadoop development?
        The 1.5 (or 5.0 as they call it) for-loop syntax is very convenient.
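For readers who have not seen both styles, here is a small comparison of the loop syntax in question. It is not taken from the patch; `ForLoopSketch` and its two methods are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ForLoopSketch {
    /** JDK 1.4 style: raw collection, explicit Iterator, and a cast on every element. */
    static int sumJdk14Style(List values) {
        int total = 0;
        for (Iterator it = values.iterator(); it.hasNext();) {
            total += ((Integer) it.next()).intValue();
        }
        return total;
    }

    /** JDK 1.5 style: generics plus the enhanced for loop, with no casts. */
    static int sumJdk15Style(List<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<Integer>();
        values.add(Integer.valueOf(1));
        values.add(Integer.valueOf(2));
        values.add(Integer.valueOf(3));
        System.out.println(sumJdk14Style(values) + " " + sumJdk15Style(values));
    }
}
```

In the 1.4 form a wrongly typed element only fails at run time with a ClassCastException, whereas the 1.5 form lets the compiler check element types.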

        A couple of changes since yesterday's javadoc:

        (1) there is now a NullContext which is used by default and does nothing; formerly the FileContext was used by default and it wrote to stdout if a file was not specified.

        (2) there is now an Updater interface for the callbacks, instead of using Runnable. This is so that the MetricsContext can be passed as an argument to the callback.
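The two changes above can be illustrated with the following sketch. The shapes are simplified and every name here (`SketchContext`, `SketchUpdater`, `NullSketchContext`, `RecordingSketchContext`, `timerTick`, `contextOrDefault`, `install`) is hypothetical; the real signatures are in the attached patch. It shows a do-nothing default context and a callback that receives the context as an argument, which a plain `Runnable.run()` could not.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified shapes; the real signatures are in the attached patch.
interface SketchContext {
    void registerUpdater(SketchUpdater updater);
    void setMetric(String name, long value);
}

/** Unlike Runnable.run(), the callback is handed the context it should report to. */
interface SketchUpdater {
    void doUpdates(SketchContext context);
}

/** Default context: accepts every call and does nothing. */
class NullSketchContext implements SketchContext {
    public void registerUpdater(SketchUpdater updater) { }
    public void setMetric(String name, long value) { }
}

/** Test double that records what a real backend would have sent. */
class RecordingSketchContext implements SketchContext {
    private final List<SketchUpdater> updaters = new ArrayList<>();
    final List<String> emitted = new ArrayList<>();

    public void registerUpdater(SketchUpdater updater) { updaters.add(updater); }
    public void setMetric(String name, long value) { emitted.add(name + "=" + value); }

    /** Stands in for the periodic timer that triggers the callbacks. */
    void timerTick() {
        for (SketchUpdater updater : updaters) {
            updater.doUpdates(this);
        }
    }
}

public class UpdaterSketch {
    /** Falls back to the null context when nothing is configured. */
    static SketchContext contextOrDefault(SketchContext configured) {
        return configured != null ? configured : new NullSketchContext();
    }

    /** Registers a callback that reports a fixed, made-up value on every tick. */
    static void install(SketchContext context) {
        context.registerUpdater(new SketchUpdater() {
            public void doUpdates(SketchContext ctx) {
                ctx.setMetric("tasksRunning", 3);
            }
        });
    }

    public static void main(String[] args) {
        RecordingSketchContext recording = new RecordingSketchContext();
        install(contextOrDefault(recording));
        recording.timerTick();
        System.out.println(recording.emitted);

        // With no configuration, the same calls are safe and have no effect.
        install(contextOrDefault(null));
    }
}
```

With a null default, instrumented code never has to check whether metrics are enabled, and nothing is written to stdout unless a context is explicitly configured.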

        Doug Cutting added a comment -

        We should still stick with JDK 1.4. Most developers are now using JDK 1.5, but many folks still use JDK 1.4 in production. So, if it's not too much pain, please back-port this to 1.4. Thanks!

        David Bowen added a comment -

        OK, here is an updated version of the patch with the code made uglier and less robust so as to accommodate people who perversely insist on running with JDK 1.4. Also, I added a note to CHANGES.txt.

        David Bowen added a comment -

        Woops, forgot to update CHANGES.txt before modifying it.

        Doug Cutting added a comment -

        I just committed this. Thanks, David!


          People

          • Assignee:
            Unassigned
            Reporter:
            David Bowen