[SPARK-2335] k-Nearest Neighbor classification and regression for MLLib - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Minor
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: None
Component/s: MLlib
Labels:
- clustering
- features

Description

The k-Nearest Neighbor model for classification and regression problems is a simple and intuitive approach, offering a straightforward path to creating non-linear decision/estimation contours. Its downsides – high variance (sensitivity to the known training data set) and computational intensity for estimating new point labels – both play to Spark's big data strengths: lots of data mitigates the variance concern; lots of workers mitigate computational latency.

We should include kNN models as options in MLlib.
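
For reference, here is a minimal single-machine sketch of the exact (brute-force) kNN model described above, covering both the classification and regression cases. It is not MLlib code and does not use any Spark API; the function names `k_nearest`, `knn_classify`, and `knn_regress`, the Euclidean distance metric, and the unweighted vote/mean are all assumptions made for illustration.

```python
import heapq
import math
from collections import Counter


def k_nearest(train, query, k):
    """Return the k (features, target) pairs closest to `query`.

    `train` is a list of (features, target) pairs; distance is Euclidean.
    Every training point is scanned, which is the per-query cost the
    description calls out as computationally intensive.
    """
    return heapq.nsmallest(k, train, key=lambda pair: math.dist(pair[0], query))


def knn_classify(train, query, k):
    """Predict a label by unweighted majority vote among the k nearest points."""
    votes = Counter(label for _, label in k_nearest(train, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train, query, k):
    """Predict a numeric target as the unweighted mean over the k nearest points."""
    neighbors = k_nearest(train, query, k)
    return sum(target for _, target in neighbors) / len(neighbors)
```

In a distributed setting, one plausible approach would be to take the k nearest candidates within each partition of the training data and then merge those candidate lists, but that is only one possible design and not something this ticket specifies.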

Issue Links

duplicates

SPARK-2336 Approximate k-NN Models for MLLib

Resolved

is depended upon by

SPARK-2336 Approximate k-NN Models for MLLib

Resolved

People

Assignee: Unassigned

Reporter: Brian Gawalt

Shepherd: Ashutosh Trivedi

Votes: 5

Watchers: 20

Dates

Created: 01/Jul/14 18:20

Updated: 16/Jan/16 13:38

Resolved: 16/Jan/16 13:38