Description
DataFrame.describe should return a DataFrame with summary statistics.
def describe(cols: String*): DataFrame
If cols is empty, then run describe on all numeric columns.
The returned DataFrame should have 5 rows (count, mean, stddev, min, max) and n + 1 columns. The 1st column is the name of the aggregate function, and the next n columns are the numeric columns of interest in the input DataFrame.
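The shape described above can be sketched with plain Scala collections. This is only an illustration under stated assumptions, not the proposed Spark implementation: `DescribeSketch`, `Summary`, `summarize`, and the `"summary"` name for the first column are all hypothetical, the input is an in-memory list of numeric columns rather than a DataFrame, and stddev is assumed to be the sample standard deviation (n - 1 denominator), which is what pandas' `std` reports.

```scala
// Hypothetical sketch: in-memory columns stand in for a DataFrame.
object DescribeSketch {
  // The five aggregates, in the row order given in the description.
  val statNames: Seq[String] = Seq("count", "mean", "stddev", "min", "max")

  // columns: the n + 1 column names; rows: one (aggregate name, n values) pair per statistic.
  final case class Summary(columns: Seq[String], rows: Seq[(String, Seq[Double])])

  // Computes count, mean, stddev, min, max for one non-empty numeric column.
  def summarize(values: Seq[Double]): Seq[Double] = {
    val n = values.size.toDouble
    val mean = values.sum / n
    // Assumption: sample variance (n - 1 denominator); undefined for a single value.
    val variance =
      if (values.size > 1) values.map(v => (v - mean) * (v - mean)).sum / (n - 1)
      else Double.NaN
    Seq(n, mean, math.sqrt(variance), values.min, values.max)
  }

  // If cols is empty, every column in data is described (all are numeric here).
  // Selected columns keep the order they have in data.
  def describe(data: Seq[(String, Seq[Double])], cols: String*): Summary = {
    val selected =
      if (cols.isEmpty) data
      else data.filter { case (name, _) => cols.contains(name) }
    val perColumn = selected.map { case (_, values) => summarize(values) }
    val rows = statNames.zipWithIndex.map { case (stat, i) =>
      stat -> perColumn.map(stats => stats(i))
    }
    Summary("summary" +: selected.map(_._1), rows)
  }
}
```

Calling `DescribeSketch.describe(data)` on two columns yields 5 rows, each holding the aggregate name plus 2 values, under 3 column names. Passing column names, as in `DescribeSketch.describe(data, "B")`, restricts the summary to those columns.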
The output is similar to that of pandas' describe(), minus the percentile rows, since accurate percentiles are too expensive to compute for Big Data. For reference, the pandas output (with percentiles removed) looks like this:

In [19]: df.describe()
Out[19]:
              A         B         C         D
count  6.000000  6.000000  6.000000  6.000000
mean   0.073711 -0.431125 -0.687758 -0.233103
std    0.843157  0.922818  0.779887  0.973118
min   -0.861849 -2.104569 -1.509059 -1.135632
max    1.212112  0.567020  0.276232  1.071804