Details
- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Some Param objects in deep learning models are much larger than others. For example, a weight matrix is often around 100 times larger than a bias vector. This difference in Param size causes two problems:
1. if there are multiple servers in one server group, the servers may be assigned different numbers of parameters to update;
2. if there are multiple server groups, e.g., in a distributed Hogwild framework, these server groups may be assigned different numbers of parameters to maintain.
This ticket is to slice large Param objects to solve the load-balance problem. The slicing operations are done in the stub thread to make them transparent to both workers and servers.
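A minimal sketch of the idea, not the actual SINGA implementation: each Param is cut into slices no larger than the average per-server load, and the slices are then assigned greedily to the least-loaded server. The function name, the slice threshold, and the greedy assignment heuristic are all illustrative assumptions.

```python
def slice_and_assign(param_sizes, num_servers):
    """Slice each param so no slice exceeds ~(total size / num_servers),
    then assign slices largest-first to the least-loaded server.
    param_sizes: list of element counts, one per Param object.
    Returns (assignment, loads): per-server slice lists and total loads.
    """
    threshold = max(1, sum(param_sizes) // num_servers)
    slices = []  # (param_id, offset, length)
    for pid, size in enumerate(param_sizes):
        offset = 0
        while offset < size:
            length = min(threshold, size - offset)
            slices.append((pid, offset, length))
            offset += length
    # Greedy longest-slice-first assignment (LPT heuristic).
    loads = [0] * num_servers
    assignment = [[] for _ in range(num_servers)]
    for s in sorted(slices, key=lambda x: -x[2]):
        i = loads.index(min(loads))  # least-loaded server so far
        assignment[i].append(s)
        loads[i] += s[2]
    return assignment, loads

# A 10000-element weight matrix plus a 100-element bias, two servers:
# without slicing one server would hold 10000 elements and the other 100;
# with slicing both end up with 5050.
assignment, loads = slice_and_assign([10000, 100], 2)
```

Because workers and servers only ever see slice IDs and offsets, the stub thread can perform this partitioning without either side being aware of it.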