We should support computing correlations between columns in DataFrames with a simple API.
This could be a DataFrame feature:
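As a rough illustration of the method-on-the-frame shape, here is a minimal, dependency-free sketch. `ToyFrame`, its `corr` method, and the `method` parameter are hypothetical stand-ins invented for this example; they are not Spark API, and only Pearson correlation is sketched.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


class ToyFrame:
    """Hypothetical stand-in for a DataFrame: a dict of equal-length columns."""

    def __init__(self, columns):
        self.columns = columns

    def corr(self, col1, col2, method="pearson"):
        # Only Pearson is sketched; any other method name is rejected.
        if method != "pearson":
            raise ValueError(f"unsupported correlation method: {method}")
        return pearson(self.columns[col1], self.columns[col2])


df = ToyFrame({"col1": [1.0, 2.0, 3.0, 4.0], "col2": [2.0, 4.0, 6.0, 8.0]})
# Both columns are looked up by name in the same frame, so their rows line up.
r = df.corr("col1", "col2")
```

Because the columns are always resolved against one frame, this shape cannot be handed two columns with mismatched rows.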
Or it could be an MLlib feature:
(The first Statistics.corr option is more flexible, but it could cause trouble if a user tries to pass in two DataFrame columns that cannot be zipped together.)
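The trade-off between a column-pair signature and a frame-plus-column-names signature can be sketched as follows. This is an illustration under stated assumptions: `corr_columns` and `corr_frame` are hypothetical names, frames are modeled as plain dicts of lists, and "unzippable" is simplified to "the two columns have different lengths".

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def corr_columns(xs, ys):
    """Column-pair form (hypothetical): accepts any two columns, even from different frames."""
    # Flexible, but the caller can pass columns whose rows do not line up.
    if len(xs) != len(ys):
        raise ValueError("columns cannot be zipped: lengths differ")
    return pearson(xs, ys)


def corr_frame(frame, col1, col2):
    """Frame-plus-names form (hypothetical): both columns come from one frame."""
    return pearson(frame[col1], frame[col2])


frame_a = {"col1": [1.0, 2.0, 3.0], "col2": [3.0, 2.0, 1.0]}
frame_b = {"col3": [1.0, 2.0]}  # a different frame with a different row count

r_frame = corr_frame(frame_a, "col1", "col2")
r_cols = corr_columns(frame_a["col1"], frame_a["col2"])

# Mixing columns from unrelated frames is only possible with the column-pair form.
try:
    corr_columns(frame_a["col1"], frame_b["col3"])
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

In this sketch the column-pair form has to validate its inputs at call time, while the frame-plus-names form rules out the mismatch by construction.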
Note: R follows the latter setup. I'm OK with either.