Mesos / MESOS-4595

Add support for newest pre-defined Perf events to PerfEventIsolator

Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: containerization
    • Labels: None

    Description

      Currently, the Perf Event Isolator can monitor every perf event specified in --perf_events=..., but it can map only some of them into ResourceUsage.proto (more precisely, into PerfStatistics.proto).

      Since PerfStatistics.proto was last updated, the list of supported events has expanded considerably and keeps growing. I have put together a comparison table:

      Event type          Matched in PerfStatistics   Available in perf 4.3.3
      HW events           8                           8
      SW events           9                           10
      HW cache events     20                          20
      Kernel PMU events   0                           37
      Tracepoint events   0                           billion (:

      For advanced analysis (e.g. during oversubscription in the QoS Controller), support for additional events is crucial. For instance, in Serenity we based some of our revocation algorithms on the new CMT feature, which provides an additional, useful event called llc_occupancy.

      I think we all agree that it would be great to support more (or even all) perf events in Mesos PerfEventIsolator (:


      Let's start a discussion about the approach. This task raises three issues:

      1. Which events do we want to support in Mesos?
        1. All of them?
        2. Only add Kernel PMU events?

          I don't have a strong opinion on this, since I have never used Tracepoint events. We currently need PMU events.
      2. How to add new (or modify existing) events in mesos.proto?
        We can distinguish three approaches here:
        1. Add new events statically to PerfStatistics.proto as separate optional fields (as is done currently).
        2. Instead of optional fields in the PerfStatistics.proto message, use a key-value map (something like labels in other messages) and fill it dynamically in PerfEventIsolator.
        3. Mix the two approaches above and just add the map to the existing PerfStatistics.proto for additional events (:

          IMO: Approach 1 is at least explicit: users can see which events to expect (although the names are transformed, e.g. "-" becomes "_"), but we would end up with a very long message and a lot of copy-paste work. And we would have to maintain it!
          Approaches 2 and 3 are more flexible and avoid the naming problem described in issue 3 below (: They would also support all perf events in all kernel versions (:
          IMO approaches 2 and 3 are the best.
      3. How to support different naming formats? For instance, intel_cqm/llc_occupancy/ has "/" in its name and migrate:mm_migrate_pages has ":". I don't think such names can be used as field names in .proto syntax.
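
To make the naming issue concrete, here is a minimal sketch of what mapping event names onto identifier-safe field names could look like. The helper to_field_name is hypothetical and is not the code Mesos uses; it only assumes that a field name may contain letters, digits, and underscores.

```python
import re


def to_field_name(event: str) -> str:
    """Hypothetical helper: turn a perf event name into an identifier-safe name."""
    # Replace every character outside [A-Za-z0-9_] (e.g. "-", "/", ":") with "_".
    name = re.sub(r"[^A-Za-z0-9_]", "_", event)
    # Drop underscores left at the edges, e.g. by the trailing "/" of a PMU event.
    return name.strip("_").lower()


for event in ["cache-misses", "intel_cqm/llc_occupancy/", "migrate:mm_migrate_pages"]:
    print(event, "->", to_field_name(event))
```

Note that such a mapping is lossy: two distinct event names that differ only in their separators would collapse to the same field name. A map keyed by the original event string would not have this problem, since map keys are plain strings.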

      Currently, approach 3 from issue 2 is chosen: add a dynamic map to the existing PerfStatistics.proto for additional events specified in --perf_events=...
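
A rough sketch of how the mixed approach could behave, modelled in Python rather than in the actual .proto and C++ code. The class PerfStatisticsSketch, the field name additional, and the function fill are assumptions made for illustration only; cycles and cache_misses stand in for the existing statically declared fields.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PerfStatisticsSketch:
    # Stand-ins for the existing, statically declared optional fields.
    cycles: Optional[int] = None
    cache_misses: Optional[int] = None
    # Hypothetical dynamic map for events that have no static field.
    additional: Dict[str, float] = field(default_factory=dict)


STATIC_FIELDS = {"cycles", "cache_misses"}


def fill(samples: Dict[str, float]) -> PerfStatisticsSketch:
    """Route each sampled event to its static field, or to the map otherwise."""
    stats = PerfStatisticsSketch()
    for event, value in samples.items():
        # Existing fields use "_" where perf uses "-".
        name = event.replace("-", "_")
        if name in STATIC_FIELDS:
            setattr(stats, name, int(value))
        else:
            # Keep the original event name as the key, including "/" or ":".
            stats.additional[event] = value
    return stats


stats = fill({"cycles": 100, "cache-misses": 7, "intel_cqm/llc_occupancy/": 4096.0})
print(stats)
```

With this split, existing consumers keep reading the static fields unchanged, while any event without a static field still reaches the consumer under its original name.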

          People

            Assignee: Bartek Plotka
            Reporter: Bartek Plotka
            Shepherd: Niklas Quarfot Nielsen
            Votes: 1
            Watchers: 3
