Affects Version/s: 0.10.0
Fix Version/s: None
We are using Grafana/InfluxDB for metrics, and Samza's current metrics model does not fit it particularly well.
InfluxDB recently introduced so-called "tags", and the Grafana UI offers great value when using them. The idea is to keep the metric name very simple, for example cpu.use, and to supply the measurement with tags such as the host or the Samza job name.
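As a rough illustration of what a tagged point looks like, here is a minimal sketch that renders a measurement in InfluxDB line protocol (measurement, then comma-separated tags, then fields). The helper names, the single "value" field, and the tag values node-1 and my-job are assumptions made up for this example; they are not taken from Samza or from our setup.

```python
def escape(text, specials):
    """Backslash-escape the characters that are special in line protocol."""
    for ch in specials:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement, tags, value):
    """Render one point as: measurement,tag1=v1,tag2=v2 value=<float>.

    Hypothetical helper for illustration only; timestamps are omitted.
    """
    name = escape(measurement, ", ")
    # Tags are sorted by key so the output is deterministic.
    tag_part = "".join(
        "," + escape(key, ",= ") + "=" + escape(val, ",= ")
        for key, val in sorted(tags.items())
    )
    return f"{name}{tag_part} value={float(value)}"


# Example values are invented for this sketch.
line = to_line_protocol(
    "cpu.use", {"host": "node-1", "samza_job_name": "my-job"}, 0.42
)
print(line)
```

The metric name stays short, and everything that identifies where the value came from travels as tags.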
From what I can read, OpenTSDB uses tags too.
Having tags instead of long metric names is much more convenient, and in some cases it is the only way to perform certain operations. For example, I want to have an alert on the throughput of a Samza job. With tags encoded in the metric name this is impossible, because I would have to put a list of all machine names and Samza job names into the InfluxDB SELECT statement, and even then there is no way to group them properly. With tags it is as simple as SELECT ... GROUP BY [[tag_host]],[[tag_samza_job_name]]. You can add new machines to the cluster and new jobs to YARN, and they will appear in your metrics with zero configuration effort.
Currently, I have partially mitigated the issue by stripping the first dot-separated part of the metric name and turning it into a "samza-src" tag, on the assumption that it is the container name. But in many metrics the partition number is encoded in the metric name too. Its location is not consistent and not all metrics have it, so I cannot build an alerting system on top of Samza metrics.
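The workaround amounts to something like the sketch below. The function name and the sample metric name are hypothetical and only show the idea of cutting off the first dot-separated part; the real Samza metric names differ.

```python
def split_source(metric_name):
    """Move the first dot-separated part of a metric name into a tag.

    Assumes (as the workaround does) that the first part is the
    container name. Names without a dot are returned unchanged.
    """
    source, sep, rest = metric_name.partition(".")
    if not sep:
        return metric_name, {}
    return rest, {"samza-src": source}


# Hypothetical metric name, used for illustration only.
name, tags = split_source("container-0.process-calls")
print(name, tags)
```

Anything else that is encoded further inside the name, such as a partition number, is left untouched by this approach, which is exactly the limitation described above.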
Change Samza's internal metrics to use tags (string key-value pairs) and leave the job of constructing the metric name to the output metric plugin.
This would preserve backward compatibility: the JMX reporter would construct the metric name the same way it does today, while the InfluxDB plugin would leave the name unmodified and attach the list of tags to the measurement.
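A minimal sketch of the proposed split, assuming a metric is just a short name, a value, and a dictionary of tags. The class, the function names, the tag keys, and the way the legacy-style name is assembled are all hypothetical; this is not Samza's actual API or its real JMX naming scheme.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    """Hypothetical tagged metric: short name plus string key-value tags."""
    name: str
    value: float
    tags: dict = field(default_factory=dict)


def legacy_style_name(metric, tag_order):
    """What a name-based reporter could do: fold tag values into the name.

    tag_order lists which tags to prepend, and in what order; tags that
    are absent on this metric are simply skipped.
    """
    parts = [metric.tags[key] for key in tag_order if key in metric.tags]
    return ".".join(parts + [metric.name])


def tagged_point(metric):
    """What a tag-aware reporter could do: keep the name, pass tags along."""
    return {
        "measurement": metric.name,
        "tags": dict(metric.tags),
        "fields": {"value": metric.value},
    }


# Example values are invented for this sketch.
m = Metric("cpu.use", 0.42, {"host": "node-1", "partition": "3"})
print(legacy_style_name(m, ["host", "partition"]))
print(tagged_point(m))
```

Both outputs are derived from the same internal representation, so each reporter decides on its own how much of the tag information ends up in the name.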