Details
Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Labels: None
Description
Both the Downpour framework from Google Brain [1] and Caffe's distributed Hogwild implementation are extensions of shared-memory Hogwild training. This ticket refers to the latter.
Specifically, each server group masters a subset of the parameters (i.e., Param objects) when synchronizing with other server groups: it aggregates all updates for its subset and broadcasts the updated parameters to all other server groups. The synchronization is conducted asynchronously. In the first implementation the synchronization frequency can be fixed; eventually it should be tuned automatically to fully utilize the network bandwidth.
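Below is a minimal sketch of this synchronization scheme, assuming two server groups, a plain float vector standing in for SINGA's Param objects, and a fixed synchronization period; the names (ServerGroup, Sync, kSyncEveryMs) are illustrative only and not part of the SINGA API.

// Hypothetical sketch: each server group masters one slice of the parameters,
// aggregates updates for that slice, and periodically broadcasts the updated
// slice to the other groups' replicas, asynchronously and without global locks
// (races are tolerated, Hogwild style).
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

constexpr int kDim = 8;          // total number of parameters
constexpr int kGroups = 2;       // number of server groups
constexpr int kSyncEveryMs = 50; // fixed synchronization period (to be auto-tuned later)

struct ServerGroup {
  int id;                     // determines which slice this group masters
  std::vector<float> params;  // local replica of all parameters
  std::vector<float> pending; // aggregated updates for the mastered slice
};

// Apply the aggregated updates to the mastered slice, then broadcast that
// slice to every other group's replica.
void Sync(ServerGroup& g, std::vector<ServerGroup>& all) {
  int begin = g.id * (kDim / kGroups), end = begin + kDim / kGroups;
  for (int i = begin; i < end; ++i) {
    g.params[i] += g.pending[i - begin];
    g.pending[i - begin] = 0.0f;
  }
  for (auto& other : all)
    if (other.id != g.id)
      for (int i = begin; i < end; ++i) other.params[i] = g.params[i];
}

int main() {
  std::vector<ServerGroup> groups(kGroups);
  for (int i = 0; i < kGroups; ++i) {
    groups[i] = {i, std::vector<float>(kDim, 0.0f),
                 std::vector<float>(kDim / kGroups, 0.0f)};
  }

  std::atomic<bool> stop{false};
  std::vector<std::thread> threads;
  for (int i = 0; i < kGroups; ++i) {
    threads.emplace_back([&, i] {
      while (!stop) {
        // Pretend workers sent gradient updates for the mastered slice.
        for (auto& u : groups[i].pending) u += 0.1f;
        Sync(groups[i], groups);  // fixed-frequency, asynchronous sync
        std::this_thread::sleep_for(std::chrono::milliseconds(kSyncEveryMs));
      }
    });
  }
  std::this_thread::sleep_for(std::chrono::milliseconds(300));
  stop = true;
  for (auto& t : threads) t.join();
  for (float p : groups[0].params) std::printf("%.1f ", p);
  std::printf("\n");
  return 0;
}

In an actual implementation the broadcast would be a network message rather than a direct write into the other replicas, and the fixed period kSyncEveryMs would be replaced by an automatically tuned frequency, as described above.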
[1] J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, Q. V. Le, M. Z. Mao, M. Ranzato, A. W. Senior, P. A. Tucker, K. Yang, and A. Y. Ng. Large scale distributed deep networks. In NIPS, pages 1232–1240, 2012.