Details
- Type: New Feature
- Status: Resolved
- Priority: Minor
- Resolution: Duplicate
- None
- None
Description
The k-Nearest Neighbor (kNN) model for classification and regression problems is a simple and intuitive approach, offering a straightforward path to creating non-linear decision/estimation contours. Its downsides, high variance (sensitivity to the particular training data set) and the computational cost of estimating labels for new points, both play to Spark's big data strengths: a large training set mitigates the variance concern, and a large number of workers mitigates the computational latency.
We should include kNN models as options in MLlib.
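For reference, a minimal single-machine sketch of the brute-force kNN idea, in plain Python. This is not a proposed MLlib API: the function name `knn_predict`, its parameters, and the choice of Euclidean distance are assumptions made for illustration only. It shows where the prediction cost comes from, since every query is compared against every training point, which is the part a distributed implementation would spread across workers.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3, mode="classify"):
    """Predict a label for `query` from its k nearest training points.

    train: list of (features, label) pairs, features being numeric tuples.
    mode:  "classify" returns the majority label among the k neighbors,
           any other value returns the mean of their (numeric) labels.
    Hypothetical helper for illustration, not part of Spark or MLlib.
    """
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training points")

    # Brute force: measure the distance from the query to every training
    # point, then keep the k closest. This full scan is the expensive step.
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    labels = [label for _, label in neighbors]

    if mode == "classify":
        # Majority vote; ties go to the label seen first among the neighbors.
        return Counter(labels).most_common(1)[0][0]
    # Regression: average the neighbors' numeric labels.
    return sum(labels) / k


if __name__ == "__main__":
    points = [
        ((0.0, 0.0), "a"), ((0.1, 0.2), "a"),
        ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.2, 0.8), "b"),
    ]
    print(knn_predict(points, (0.05, 0.1), k=3))
```

The sort over the whole training set keeps the sketch short; a bounded heap of size k would avoid sorting everything, and an approximate index would avoid the full scan altogether.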
Issue Links
- duplicates
  - SPARK-2336 Approximate k-NN Models for MLLib (Resolved)
- is depended upon by
  - SPARK-2336 Approximate k-NN Models for MLLib (Resolved)