Description
In the unsupervised mode of kNN
http://madlib.apache.org/docs/latest/group__grp__knn.html
when `point_source` and `test_source` are the same data set, nearest neighbors does not reliably return the 0-distance point (the test point itself) as a nearest neighbor.
Could there be a small negative-value issue here, where a distance that is effectively 0 shows up as a negative epsilon?
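To make the hypothesis concrete, here is a minimal sketch of one possible guard, assuming the raw squared distance can come back as a tiny negative number because of floating-point rounding. This is not MADlib code: `clamped_sqrt`, `k_nearest`, and the tolerance `EPS` are hypothetical names chosen for illustration, and the `-1e-17` value is a made-up stand-in for the suspected negative epsilon.

```python
import math

EPS = 1e-12  # assumed tolerance; would need tuning to the scale of the data


def clamped_sqrt(sq_dist, eps=EPS):
    """Square root of a squared distance, treating tiny negatives as 0."""
    if sq_dist < -eps:
        # A clearly negative value points to a real bug, not rounding noise.
        raise ValueError("squared distance is negative: %r" % sq_dist)
    return math.sqrt(max(sq_dist, 0.0))


def k_nearest(sq_dists, k):
    """Rank neighbors of one test point.

    sq_dists maps a point id to its raw squared distance from the test point.
    Returns (ids, distances), both sorted closest to furthest.
    """
    ranked = sorted((clamped_sqrt(d), pid) for pid, d in sq_dists.items())
    top = ranked[:k]
    return [pid for _, pid in top], [dist for dist, _ in top]


# Point 1 is the test point itself; its raw squared distance is a
# hypothetical negative epsilon instead of an exact 0.
raw = {1: -1e-17, 2: 0.25, 3: 4.0}
ids, dists = k_nearest(raw, 2)
print(ids, dists)
```

With the clamp in place, the point itself ranks first with distance 0, and the distances come back in the same order as the neighbor ids. Whether the actual problem in the module is of this kind still needs to be confirmed.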
Also, please assess whether we can add a vector of distances to the output table:
Output Format
The output of the KNN module is a table with the following columns:
id INTEGER. The ids of test data points.
test_column_name DOUBLE PRECISION[]. The test data points.
prediction INTEGER. Label in case of classification, average value in case of regression.
k_nearest_neighbours INTEGER[]. List of nearest neighbors, sorted closest to furthest from the corresponding test point.
distance DOUBLE PRECISION[]. Distance sorted in the same order as the 'k_nearest_neighbours' array.