Apache Whirr (retired) / WHIRR-238

Scaling Monitor/Coordinator

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core
    • Labels:
      None

      Description

      From the mailing list:

      General idea:
      Add an elastic scaling monitor and coordinator, i.e. a whirr process that would be running on some or all of the nodes that:

      • would collect load metrics (both generic and specific to each application)
      • would feed them through an elastic decision making engine (also specific to each application as it depends on the specific metrics)
      • would then act on those decisions by either expanding or contracting the cluster.
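
The three steps above (collect, decide, act) can be pictured as one pass of a small control loop. The following is only a minimal sketch under stated assumptions: `ScalingLoop`, the metric name `cpuLoad`, and the convention that the engine returns a signed node delta are all hypothetical and are not part of Whirr.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.function.IntConsumer;
import java.util.function.Supplier;

/** One pass of the collect -> decide -> act cycle. All names are hypothetical. */
final class ScalingLoop {
    private final Supplier<Map<String, Double>> metrics;          // collects load metrics
    private final Function<Map<String, Double>, Integer> engine;  // node delta: +n expand, -n contract, 0 hold
    private final IntConsumer actuator;                           // requests or releases nodes

    ScalingLoop(Supplier<Map<String, Double>> metrics,
                Function<Map<String, Double>, Integer> engine,
                IntConsumer actuator) {
        this.metrics = metrics;
        this.engine = engine;
        this.actuator = actuator;
    }

    /** Runs one iteration and returns the delta that was decided. */
    int runOnce() {
        Map<String, Double> snapshot = metrics.get();
        int delta = engine.apply(snapshot);
        if (delta != 0) {
            actuator.accept(delta); // only touch the cluster when a change was decided
        }
        return delta;
    }
}

final class ScalingLoopDemo {
    public static void main(String[] args) {
        int[] clusterSize = {3}; // stand-in for a real cluster
        ScalingLoop loop = new ScalingLoop(
            () -> Map.of("cpuLoad", 0.92),
            m -> m.getOrDefault("cpuLoad", 0.0) > 0.8 ? 1 : 0,
            delta -> clusterSize[0] += delta);
        loop.runOnce();
        System.out.println("cluster size: " + clusterSize[0]);
    }
}
```

Keeping the three roles behind separate functional interfaces is one way to let each application swap in its own metrics and decision logic while the loop itself stays generic.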

      Some specifics:

      • it need not be completely distributed, i.e. it can have a specific assigned node that monitors/coordinates, but this node must not be fixed: the role could/should move to another node if the previous coordinator leaves the cluster.
      • each application would define the set of metrics that it emits and use a local monitor process to feed them to the coordinator.
      • the monitor process should emit some standard metrics (Disk I/O, CPU Load, Net I/O, memory)
      • the coordinator would have a pluggable decision engine policy also defined by the application that would consume metrics and make a decision.
      • whirr would take care of requesting/releasing nodes and adding/removing them from the relevant services.
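
As an illustration of the pluggable decision engine policy described above, here is a minimal sketch. The interface `DecisionPolicy`, the `NodeMetrics` fields, and the thresholds are assumptions made for this example only; the issue does not specify an API. The example policy looks only at mean CPU load, whereas a real application would define its own metrics and rules.

```java
import java.util.List;

/** Standard metrics one node's monitor might report (field names are assumptions). */
final class NodeMetrics {
    final String nodeId;
    final double cpuLoad;
    final double memoryUsed;
    final double diskIo;
    final double netIo;

    NodeMetrics(String nodeId, double cpuLoad, double memoryUsed, double diskIo, double netIo) {
        this.nodeId = nodeId;
        this.cpuLoad = cpuLoad;
        this.memoryUsed = memoryUsed;
        this.diskIo = diskIo;
        this.netIo = netIo;
    }
}

enum ScalingDecision { EXPAND, CONTRACT, HOLD }

/** Pluggable policy: each application would supply its own implementation. */
interface DecisionPolicy {
    ScalingDecision decide(List<NodeMetrics> reports);
}

/** Example policy: scale on the mean CPU load across all reporting nodes. */
final class CpuThresholdPolicy implements DecisionPolicy {
    private final double low;
    private final double high;

    CpuThresholdPolicy(double low, double high) {
        if (low >= high) {
            throw new IllegalArgumentException("low must be below high");
        }
        this.low = low;
        this.high = high;
    }

    @Override
    public ScalingDecision decide(List<NodeMetrics> reports) {
        if (reports.isEmpty()) {
            return ScalingDecision.HOLD; // no data, so take no action
        }
        double mean = reports.stream().mapToDouble(m -> m.cpuLoad).average().orElse(0.0);
        if (mean > high) return ScalingDecision.EXPAND;
        if (mean < low) return ScalingDecision.CONTRACT;
        return ScalingDecision.HOLD;
    }
}
```

With thresholds of 0.2 and 0.8, two nodes reporting CPU loads of 0.9 and 0.95 would yield `EXPAND`, and the gap between the thresholds keeps the cluster from flapping between expanding and contracting on small fluctuations.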

      Some implementation ideas:

      • it could run on top of ZooKeeper. ZooKeeper is already a requirement for several services, and it would allow coordinator state to be stored reliably so that another node can pick up if the previous coordinator leaves the cluster.
      • it could use Avro to serialize/deserialize metrics data
      • it should be optional, i.e. simply another service that the whirr cli starts
      • it would also be nice to have a monitor/coordinator web page that displays metrics and cluster status in an aggregated view.
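
To make the coordinator hand-over idea concrete, the sketch below simulates it in memory. It does not use the ZooKeeper client API; `CoordinatorRegistry` is a hypothetical stand-in that imitates the common "lowest sequence number wins" election pattern and keeps the coordinator state outside any single node, so the next coordinator can pick it up.

```java
import java.util.Optional;
import java.util.TreeMap;

/**
 * In-memory stand-in for a coordination service (hypothetical, not ZooKeeper).
 * The member with the lowest sequence number acts as coordinator.
 */
final class CoordinatorRegistry {
    private final TreeMap<Long, String> members = new TreeMap<>(); // sequence -> node id
    private long nextSequence = 0;
    private String savedState = ""; // coordinator state that must survive a hand-over

    /** A node joins and receives a monotonically increasing sequence number. */
    synchronized long join(String nodeId) {
        long sequence = nextSequence++;
        members.put(sequence, nodeId);
        return sequence;
    }

    /** A node leaves the cluster; its entry disappears. */
    synchronized void leave(long sequence) {
        members.remove(sequence);
    }

    /** Returns the current coordinator, or empty if no node is registered. */
    synchronized Optional<String> coordinator() {
        return members.isEmpty()
            ? Optional.empty()
            : Optional.of(members.firstEntry().getValue());
    }

    synchronized void saveState(String state) { savedState = state; }

    synchronized String loadState() { return savedState; }
}

final class CoordinatorRegistryDemo {
    public static void main(String[] args) {
        CoordinatorRegistry registry = new CoordinatorRegistry();
        long a = registry.join("node-a");
        registry.join("node-b");
        registry.saveState("targetSize=5"); // written by node-a while it coordinates
        registry.leave(a);                  // node-a leaves the cluster
        System.out.println(registry.coordinator().orElse("none")
            + " resumes with " + registry.loadState());
    }
}
```

In a real deployment the registry's role would be played by the coordination service itself, with membership tied to node liveness rather than explicit `leave` calls.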

              People

              • Assignee:
                Unassigned
              • Reporter:
                dr-alves David Alves
              • Votes:
                2
              • Watchers:
                4