Follow-on from https://issues.apache.org/jira/browse/MADLIB-927
which supports only one distance function. This JIRA is to
add additional distance metrics. The model to follow is
TEXT, default: 'squared_dist_norm2'. The name of the function to use to calculate the distance between data points.
The following distance functions can be used (computation of barycenter/mean in parentheses):
dist_norm1: 1-norm/Manhattan (element-wise median [Note that MADlib does not provide a median aggregate function for support and performance reasons.])
dist_norm2: 2-norm/Euclidean (element-wise mean)
squared_dist_norm2: squared Euclidean distance (element-wise mean)
dist_angle: angle (element-wise mean of normalized points)
dist_tanimoto: Tanimoto (element-wise mean of normalized points)
user-defined function with signature DOUBLE PRECISION[] x, DOUBLE PRECISION[] y -> DOUBLE PRECISION
and also check if there are other distance functions under
that might make sense to include while you are at it, in addition to the ones listed above
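For reference, the listed metrics can be sketched in plain Python as below. This is only an illustration, not MADlib code: the function names mirror the list above, but the formulas for dist_angle (arccos of cosine similarity) and dist_tanimoto (1 minus the Tanimoto coefficient) are assumed from the usual definitions, and DISTANCES / resolve_distance are hypothetical helpers showing how a name or a user-defined callable might be selected.

```python
import math


def dist_norm1(x, y):
    # 1-norm / Manhattan distance
    return sum(abs(a - b) for a, b in zip(x, y))


def squared_dist_norm2(x, y):
    # squared Euclidean distance
    return sum((a - b) ** 2 for a, b in zip(x, y))


def dist_norm2(x, y):
    # 2-norm / Euclidean distance
    return math.sqrt(squared_dist_norm2(x, y))


def _dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def dist_angle(x, y):
    # Assumed definition: angle between the vectors, in radians.
    cos = _dot(x, y) / (math.sqrt(_dot(x, x)) * math.sqrt(_dot(y, y)))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos)))


def dist_tanimoto(x, y):
    # Assumed definition: 1 - x.y / (|x|^2 + |y|^2 - x.y)
    dot = _dot(x, y)
    return 1.0 - dot / (_dot(x, x) + _dot(y, y) - dot)


# Hypothetical lookup table keyed by the names used in this ticket.
DISTANCES = {
    "dist_norm1": dist_norm1,
    "dist_norm2": dist_norm2,
    "squared_dist_norm2": squared_dist_norm2,
    "dist_angle": dist_angle,
    "dist_tanimoto": dist_tanimoto,
}


def resolve_distance(fn_dist="squared_dist_norm2"):
    # Accept either a built-in name or a user-defined callable (x, y) -> float.
    if callable(fn_dist):
        return fn_dist
    return DISTANCES[fn_dist]
```

With this sketch, resolve_distance() falls back to squared_dist_norm2, matching the default stated above, and resolve_distance(my_fn) passes a user-defined function through unchanged.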
(2) Add an option for weighted average in the voting.
- this requirement moved to a separate JIRA: https://issues.apache.org/jira/browse/MADLIB-1181