We're currently graphing both the mean % full across datanodes and the standard deviation of individual datanodes from that mean, using a script that parses the output of 'dfsadmin -report'. Our DFS cluster nodes all have the same amount of disk space, so you'd expect the mean across individual datanodes to equal % DFS full, but the two don't quite match. We haven't yet looked into why.
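For anyone who wants to reproduce the numbers, here is a rough sketch of that kind of script. It is not our actual script. It assumes each datanode block in the report starts with a "Name:" line and contains a "DFS Used%: NN.NN%" line; the report layout differs between Hadoop versions, so the regex and the block detection may need adjusting. The function names and the sample text (with made-up values) are purely illustrative.

```python
import re
import statistics

# Assumed line format inside each datanode block, e.g. "DFS Used%: 40.00%".
USED_PCT = re.compile(r"^DFS Used%:\s*([0-9.]+)\s*%")


def datanode_used_percents(report_text):
    """Return one 'DFS Used%' value per datanode block in the report text."""
    percents = []
    in_datanode = False  # True once a "Name:" line has opened a node block
    for line in report_text.splitlines():
        line = line.strip()
        if line.startswith("Name:"):
            in_datanode = True
            continue
        match = USED_PCT.match(line)
        # Ignore the cluster-wide summary line, which precedes any "Name:" line.
        if match and in_datanode:
            percents.append(float(match.group(1)))
            in_datanode = False  # take only one value per node block
    return percents


def mean_and_stddev(percents):
    """Mean and population standard deviation of per-node usage percentages."""
    return statistics.mean(percents), statistics.pstdev(percents)


# Made-up sample in the assumed layout, for illustration only.
SAMPLE_REPORT = """\
DFS Used%: 50.00%

Name: 10.0.0.1:50010
DFS Used%: 40.00%

Name: 10.0.0.2:50010
DFS Used%: 50.00%

Name: 10.0.0.3:50010
DFS Used%: 60.00%
"""

if __name__ == "__main__":
    mean, stddev = mean_and_stddev(datanode_used_percents(SAMPLE_REPORT))
    print(f"mean={mean:.2f}% stddev={stddev:.2f}%")
```

In practice the report text would come from capturing the command's output (for example with subprocess) rather than from a hard-coded string. The sketch uses the population standard deviation because every datanode is measured; switch to statistics.stdev if you prefer the sample version.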

To directly answer Konstantin's question, the one line we're using is the standard deviation.

I just committed this. Thanks Dmytro!