Charles Lin and I are working on our IBM SJSU master's project, "Pattern recognition of Hadoop generated metrics".
The purpose of the project is to use libsvm to predict the health of a Hadoop cluster.
The scope of the project includes:
1) gathering a large-scale data set of metrics from healthy and unhealthy clusters
2) using the data from #1 and libsvm to generate a trained model
3) periodically collecting metrics and comparing them against the trained model with libsvm to predict the cluster's health
a) if the cluster is predicted to be unhealthy, sending an email notification to the system administrator
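
As a rough illustration of steps 1 and 3a, here is a minimal Python sketch of how a metrics sample could be written in libsvm's sparse text format ("<label> <index>:<value> ...", with 1-based ascending indices) and how an alert email could be built for an unhealthy prediction. The metric names, the +1/-1 label encoding, the addresses, and the function names are all assumptions made for this example and are not taken from the project. Training and prediction are left to libsvm itself.

```python
from email.message import EmailMessage

# Hypothetical metric names; a real deployment would list whatever
# metrics are actually collected from the cluster.
FEATURES = ["cpu_load", "heap_used_mb", "dfs_remaining_gb"]

# Assumed label encoding for the two classes.
HEALTHY, UNHEALTHY = 1, -1


def to_libsvm_line(label, metrics):
    """Format one sample as '<label> <index>:<value> ...'.

    Indices are 1-based and ascending. Zero-valued features are
    omitted, which the sparse format allows.
    """
    pairs = []
    for index, name in enumerate(FEATURES, start=1):
        value = metrics.get(name, 0.0)
        if value != 0:
            pairs.append(f"{index}:{value:g}")
    return " ".join([f"{label:+d}"] + pairs)


def build_alert(predicted_label, admin_address,
                sender="monitor@example.com"):  # placeholder sender address
    """Return an EmailMessage for an unhealthy prediction, else None."""
    if predicted_label != UNHEALTHY:
        return None
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = admin_address
    msg["Subject"] = "Hadoop cluster predicted unhealthy"
    msg.set_content("The latest metrics sample was classified as unhealthy.")
    return msg


if __name__ == "__main__":
    sample = {"cpu_load": 0.75, "heap_used_mb": 512, "dfs_remaining_gb": 0}
    print(to_libsvm_line(HEALTHY, sample))
    alert = build_alert(UNHEALTHY, "admin@example.com")
    print(alert["Subject"])
```

Lines produced by to_libsvm_line could be appended to a training file for step 2, and the message returned by build_alert could be handed to smtplib's SMTP.send_message for delivery. The predicted label itself would come from libsvm's prediction step, which this sketch does not implement.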