Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 0.6
    • Labels:
      None

      Description

      Improving performance of S4 applications requires metrics that can be analyzed.

      Some metrics suggested in S4-65 are:

      • event rate
      • average/median event processing time
      • event queue sizes
      • number of PE instances
      • number of processed events (maybe within a sliding window)
      • exceptions during event processing

      We'd also add things like:

      • shedding stats
      • memory usage
      • CPU load
      • checkpointing: effective checkpoints, rejected ones
      • etc...

      Ideally, we'd be able to have different kinds of metrics, including historical ones.

      We also want to be able to expose those metrics through JMX, and probably through other monitoring systems such as Nagios.

      A library that could be useful here is codahale's metrics (http://metrics.codahale.com/), but there are alternatives that could be evaluated as well.

          Activity

          Matthieu Morel added a comment -

          We added registration of various kinds of statistics about S4 in S4-95, which has already been merged into the dev branch.

          In addition, I added a simple mechanism to enable reporting to the console or to csv files, uploaded to branch S4-86.

          The `s4.metrics.config` parameter enables periodic dumps of aggregated statistics to the console or to files in csv format. This parameter is specified as an application parameter, and must match the following regular expression:

          (csv:.+|console):(\d+):(DAYS|HOURS|MICROSECONDS|MILLISECONDS|MINUTES|NANOSECONDS|SECONDS)

          Examples:

          1. dump metrics as csv files to /path/to/directory every 10 seconds
            csv:file://path/to/directory:10:SECONDS
          2. dump metrics to the console every minute
            console:1:MINUTES
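
To make the format concrete, here is a minimal sketch of how a value of `s4.metrics.config` could be validated and split into its three parts. It assumes the regular expression reads `(csv:.+|console):(\d+):(UNIT)` as the two examples suggest; the class and method names (`MetricsConfigSketch`, `describe`) are hypothetical and are not taken from the S4 code base.

```java
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the actual S4Metrics implementation.
final class MetricsConfigSketch {

    // Assumed pattern: group 1 = output target, group 2 = period, group 3 = time unit.
    static final Pattern CONFIG = Pattern.compile(
            "(csv:.+|console):(\\d+):"
            + "(DAYS|HOURS|MICROSECONDS|MILLISECONDS|MINUTES|NANOSECONDS|SECONDS)");

    // Returns a human-readable summary of the parsed configuration.
    static String describe(String config) {
        Matcher m = CONFIG.matcher(config);
        // matches() requires the whole string to fit the pattern.
        if (!m.matches()) {
            throw new IllegalArgumentException("Invalid metrics config: " + config);
        }
        String target = m.group(1);
        long period = Long.parseLong(m.group(2));
        TimeUnit unit = TimeUnit.valueOf(m.group(3));
        if (target.equals("console")) {
            return "console every " + period + " " + unit;
        }
        // Strip the "csv:" prefix to obtain the output location.
        String location = target.substring("csv:".length());
        return "csv to " + location + " every " + period + " " + unit;
    }

    public static void main(String[] args) {
        System.out.println(describe("csv:file://path/to/directory:10:SECONDS"));
        System.out.println(describe("console:1:MINUTES"));
    }
}
```

Because `.+` is greedy, the csv location may itself contain colons (as in `file://...`); the matcher backtracks until the trailing `:<period>:<unit>` fits.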
          Matthieu Morel added a comment -

          Patch for easy configuration of metrics output is available in branch S4-86, commit bfe0498ea2c0e4cb66a80ec5bdb87b19b9a908f9

          Daniel Gómez Ferro added a comment -

          I found a couple of minor issues:

          • In S4Metrics line 86, we have to remove the extra call 'matcher.find()'
          • In ProcessingElement we are registering a per-PE timer (line 175) but we are not using it, since the calls are commented out (lines 441-470)
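
The S4Metrics code itself is not shown here, so the following is only a sketch of why a stray `matcher.find()` is typically a bug: `java.util.regex.Matcher` is stateful, and each `find()` resumes searching after the previous match. The class name, method names, and the shortened pattern are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not the code from S4Metrics.
final class ExtraFindSketch {

    // Shortened, assumed pattern: group 2 is the reporting period.
    static final Pattern P = Pattern.compile("(console):(\\d+)");

    // Correct: a single find() positions the matcher on the only match.
    static String periodWithSingleFind(String s) {
        Matcher m = P.matcher(s);
        return m.find() ? m.group(2) : null;
    }

    // Buggy: the first find() consumes the only match, so the second
    // find() looks for a further match, fails, and no group is available.
    static String periodWithExtraFind(String s) {
        Matcher m = P.matcher(s);
        m.find();
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(periodWithSingleFind("console:1"));
        System.out.println(periodWithExtraFind("console:1"));
    }
}
```

With an input that contains exactly one match, the second variant never sees the configuration, which is consistent with the need to remove the extra call.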
          Matthieu Morel added a comment -

          Thanks for the comments! The extra find() comes from a last-minute change, made just before I uploaded the patch, to refactor some code in S4Metrics... bad idea!

          I also enabled the PE processing time, with an option to disable it.

          Updated patch in branch S4-86, commit d6b89c3074082c5cbe1a7bec7169d5c63ebfcd5c

          Daniel Gómez Ferro added a comment -

          Thanks for the changes, Matthieu! I merged this into dev, commit 457844d283b722da69ae23d22162afcae2a6bc84


            People

            • Assignee:
              Matthieu Morel
              Reporter:
              Matthieu Morel
            • Votes:
              0
              Watchers:
              3
