[SPARK-7129] Add generic boosting algorithm to spark.ml - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Incomplete
Affects Version/s: None
Fix Version/s: None
Component/s: ML
Labels:
- bulk-closed

Description

The Pipelines API will make it easier to create a generic boosting algorithm that can work with any Classifier or Regressor. Creating this feature will require researching the variants and extensions of boosting that we may want to support now or in the future, and planning an API that is properly extensible.

In particular, it will be important to think about supporting:

- multiple loss functions (for AdaBoost, LogitBoost, gradient boosting, etc.)
- multiclass variants
- multilabel variants (which will probably be in a separate class and JIRA)
- For more esoteric variants, we should consider them but not design too much around them: totally corrective boosting, cascaded models
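
To make the "multiple loss functions" point concrete, here is a minimal sketch of a booster in which the loss is a pluggable object and the base learner is any object with `fit` and `predict`. It is written in plain Python for brevity and is not the spark.ml API. Every name in it (`SquaredLoss`, `LogisticLoss`, `StumpRegressor`, `GradientBooster`, `negative_gradient`, `make_learner`) is hypothetical and chosen only for illustration.

```python
import math


class SquaredLoss:
    """L(y, f) = (y - f)^2 / 2, so the negative gradient is the residual."""

    def negative_gradient(self, y, f):
        return y - f


class LogisticLoss:
    """L(y, f) = log(1 + exp(-y * f)) for labels y in {-1, +1}."""

    def negative_gradient(self, y, f):
        return y / (1.0 + math.exp(y * f))


class StumpRegressor:
    """Hypothetical base learner: one threshold split on a 1-D feature."""

    def fit(self, xs, ys):
        best = None
        for t in sorted(set(xs)):
            left = [y for x, y in zip(xs, ys) if x <= t]
            right = [y for x, y in zip(xs, ys) if x > t]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right) if right else lmean
            err = sum((y - (lmean if x <= t else rmean)) ** 2
                      for x, y in zip(xs, ys))
            if best is None or err < best[0]:
                best = (err, t, lmean, rmean)
        _, self.t, self.left, self.right = best
        return self

    def predict(self, xs):
        return [self.left if x <= self.t else self.right for x in xs]


class GradientBooster:
    """Fits each new base learner to the negative gradient of the loss."""

    def __init__(self, make_learner, loss, n_rounds=20, step=0.5):
        self.make_learner = make_learner  # zero-argument factory
        self.loss = loss
        self.n_rounds = n_rounds
        self.step = step

    def fit(self, xs, ys):
        self.learners = []
        scores = [0.0] * len(xs)
        for _ in range(self.n_rounds):
            # Pseudo-residuals depend only on the loss, not on the learner.
            targets = [self.loss.negative_gradient(y, f)
                       for y, f in zip(ys, scores)]
            learner = self.make_learner().fit(xs, targets)
            self.learners.append(learner)
            scores = [f + self.step * h
                      for f, h in zip(scores, learner.predict(xs))]
        return self

    def predict(self, xs):
        scores = [0.0] * len(xs)
        for learner in self.learners:
            scores = [f + self.step * h
                      for f, h in zip(scores, learner.predict(xs))]
        return scores
```

In this sketch, switching from regression to binary classification means passing `LogisticLoss()` instead of `SquaredLoss()`; the booster and the base learner stay unchanged. Multiclass and multilabel variants would likely need a richer loss interface than the scalar one assumed here.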

Note: This may interact somewhat with the existing tree ensemble methods, but it should be largely separate, since the tree ensemble APIs and implementations are specialized for trees.
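
As an illustration of a booster that is separate from tree-specific code, the following sketch implements discrete AdaBoost against a small assumed interface: any weak classifier exposing `fit(xs, ys, weights)` and `predict(xs)` for labels in {-1, +1}. It is plain Python, not spark.ml code, and the names `WeightedStump`, `AdaBoost`, and `make_learner` are hypothetical. The stump is only a stand-in weak learner; the booster never inspects it.

```python
import math


class WeightedStump:
    """Hypothetical weak classifier on 1-D inputs that honors sample weights."""

    def fit(self, xs, ys, weights):
        best = None
        for t in sorted(set(xs)):
            for sign in (1, -1):
                # Weighted error of the rule: predict `sign` if x > t, else -sign.
                err = sum(w for x, y, w in zip(xs, ys, weights)
                          if (sign if x > t else -sign) != y)
                if best is None or err < best[0]:
                    best = (err, t, sign)
        _, self.t, self.sign = best
        return self

    def predict(self, xs):
        return [self.sign if x > self.t else -self.sign for x in xs]


class AdaBoost:
    """Discrete AdaBoost over any learner with fit(xs, ys, weights)/predict(xs)."""

    def __init__(self, make_learner, n_rounds=10):
        self.make_learner = make_learner  # zero-argument factory
        self.n_rounds = n_rounds

    def fit(self, xs, ys):
        n = len(xs)
        weights = [1.0 / n] * n
        self.ensemble = []
        for _ in range(self.n_rounds):
            learner = self.make_learner().fit(xs, ys, weights)
            preds = learner.predict(xs)
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if err >= 0.5:
                break  # weak learner is no better than chance
            alpha = 0.5 * math.log((1.0 - err) / max(err, 1e-12))
            self.ensemble.append((alpha, learner))
            if err == 0.0:
                break  # a perfect learner needs no further rounds
            # Up-weight misclassified points, down-weight correct ones.
            weights = [w * math.exp(-alpha * y * p)
                       for w, y, p in zip(weights, ys, preds)]
            total = sum(weights)
            weights = [w / total for w in weights]
        return self

    def predict(self, xs):
        totals = [0.0] * len(xs)
        for alpha, learner in self.ensemble:
            totals = [s + alpha * p
                      for s, p in zip(totals, learner.predict(xs))]
        return [1 if s >= 0 else -1 for s in totals]
```

The only coupling between `AdaBoost` and the weak learner is the weighted `fit` signature, which is the kind of contract a generic spark.ml version would presumably need from a Classifier that supports instance weights.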

Issue Links

is required by
- SPARK-3703 Ensemble learning methods (Resolved)

relates to
- SPARK-7409 Designing multilabel abstractions for spark.ml (Resolved)

People

Assignee: Unassigned
Reporter: Joseph K. Bradley
Votes: 0
Watchers: 14

Dates

Created: 24/Apr/15 19:17
Updated: 21/May/19 04:12
Resolved: 21/May/19 04:12