Details
- Type: Improvement
- Priority: Critical
- Status: Resolved
- Resolution: Won't Fix
Description
The current implementations of machine learning algorithms rely on the driver for some of the computation and for broadcasting data. This creates a bottleneck at the driver for both computation and communication, especially in multi-model training. An efficient implementation of AllReduce (or AllAggregate) would help free the driver:
allReduce[T](rdd: RDD[T], op: (T, T) => T): RDD[T]
This JIRA was created to discuss how to implement AllReduce efficiently, and possible alternatives.
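For discussion, below is a minimal sketch of how allReduce could be emulated today with existing primitives. The object name NaiveAllReduce, the rdd/op parameter names, and the choice of treeReduce plus a broadcast are illustrative assumptions, not a proposed implementation; note that the reduced value still passes through the driver before being re-broadcast, which is exactly the bottleneck this issue aims to remove.

import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD

// Hypothetical helper (not part of Spark): emulate allReduce with existing
// primitives. treeReduce limits per-step fan-in, but the final value still
// lands on the driver before being broadcast back to the executors.
object NaiveAllReduce {
  def allReduce[T: ClassTag](rdd: RDD[T], op: (T, T) => T): RDD[T] = {
    val sc = rdd.sparkContext
    val total = rdd.treeReduce(op)   // aggregate via a reduction tree
    val bc = sc.broadcast(total)     // ship the result back to the executors
    // Return an RDD holding one copy of the aggregate per original partition.
    rdd.mapPartitions(_ => Iterator.single(bc.value), preservesPartitioning = true)
  }
}

// Example usage: sum doubles across all partitions, result visible everywhere.
// val summed: RDD[Double] = NaiveAllReduce.allReduce(data, _ + _)

A driver-free AllReduce would instead exchange partial aggregates directly between executors (e.g. ring or butterfly patterns), which is what this issue proposes to explore.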
Attachments
Issue Links
- is related to:
  - SPARK-24374 SPIP: Support Barrier Execution Mode in Apache Spark (Resolved)
  - SPARK-2174 Implement treeReduce and treeAggregate (Resolved)