Mesos / MESOS-9235

Add per-Process event queue counters in libprocess.

Details

    • Type: Improvement
    • Status: Accepted
    • Priority: Major
    • Resolution: Unresolved
    • None
    • None
    • libprocess, metrics
    • None

    Description

      Currently, a few Processes have one-off event queue size metrics computed using PullGauges. This approach has several known disadvantages:

      • Getting event queue size metrics for a Process requires changing code and recompiling.
      • A pull gauge that dispatches onto the Process slows down metrics responses. It also reports the queue size only after all messages that arrived before the gauge's dispatch have been processed (see MESOS-8914).
      • A single "size" metric does not let one observe the overall enqueue and dequeue throughput.
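
The second disadvantage can be illustrated with a toy model. The sketch below is not libprocess code: ToyProcess and sizeViaDispatch are hypothetical names, and the queue is a single-threaded stand-in for a Process's event queue. It only shows that a size measurement which travels through the queue runs after everything ahead of it has been processed.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical single-threaded stand-in for a Process's event queue.
struct ToyProcess {
  std::deque<std::function<void()>> events;

  void enqueue(std::function<void()> f) { events.push_back(std::move(f)); }

  // Run events in FIFO order until the queue is empty.
  void drain() {
    while (!events.empty()) {
      std::function<void()> f = std::move(events.front());
      events.pop_front();
      f();
    }
  }
};

// Measures the queue size from inside a dispatched event, the way a pull
// gauge that dispatches onto the Process would (assumed behavior, modeled
// here in simplified form).
std::size_t sizeViaDispatch(ToyProcess& p) {
  std::size_t observed = 0;
  p.enqueue([&p, &observed] { observed = p.events.size(); });
  p.drain();
  return observed;
}
```

In this model, a queue holding five pending events has a direct size of 5, while sizeViaDispatch sees only the events that arrived after its own dispatch was enqueued.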

      These can be replaced by introducing first-class support in libprocess for event queue metrics. For queue size / throughput, we can take the following approach:

      • Use configuration to opt in to metrics for Processes of interest. For example, specify "master,allocator" to enable metrics for those Processes.
      • Expose a pair of counters for "enqueued" and "dequeued" messages. The user can compute the queue size by subtracting "dequeued" from "enqueued". For better usability, we could also expose size as a pull gauge that either subtracts the two values (prone to racing) or inspects the queue size directly without a trip through the queue.
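
A minimal sketch of the two ideas above, under stated assumptions: EventQueueCounters and parseEnabled are hypothetical names chosen for illustration, not existing libprocess APIs, and the comma-separated format is taken from the "master,allocator" example.

```cpp
#include <atomic>
#include <cstdint>
#include <set>
#include <sstream>
#include <string>

// Hypothetical per-Process counters; not the libprocess API.
struct EventQueueCounters {
  std::atomic<uint64_t> enqueued{0};
  std::atomic<uint64_t> dequeued{0};

  void onEnqueue() { enqueued.fetch_add(1, std::memory_order_relaxed); }
  void onDequeue() { dequeued.fetch_add(1, std::memory_order_relaxed); }

  // Derived size. The two loads are not a single atomic snapshot, so under
  // concurrency the result is approximate (the race mentioned above); the
  // clamp keeps it from wrapping around if the reads interleave badly.
  uint64_t size() const {
    const uint64_t d = dequeued.load();
    const uint64_t e = enqueued.load();
    return e >= d ? e - d : 0;
  }
};

// Parses an opt-in list such as "master,allocator" into a set of names.
// Empty entries (e.g. from a trailing comma) are ignored.
std::set<std::string> parseEnabled(const std::string& flag) {
  std::set<std::string> names;
  std::istringstream in(flag);
  std::string name;
  while (std::getline(in, name, ',')) {
    if (!name.empty()) {
      names.insert(name);
    }
  }
  return names;
}
```

Because both counters only ever increase, a monitoring system can derive enqueue and dequeue rates from successive samples, which a single size gauge cannot provide.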

            People

              Unassigned
              Benjamin Mahler (bmahler)
              Benjamin Mahler
              Votes: 0
              Watchers: 3
