Singa / SINGA-19

Slice large Param objects for load-balance

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      Some Param objects in deep learning models are much larger than others. For example, a weight matrix is usually about 100 times larger than a bias vector. This difference in Param size causes two problems:

      1. If there are multiple servers in one server group, the servers may be assigned different numbers of parameters to update.
      2. If there are multiple server groups, e.g., in the distributed Hogwild framework, the server groups may be assigned different numbers of parameters to maintain.

      This ticket is to slice large Param objects to solve the load-balance problem. The slicing is done in the stub thread, so it is transparent to both workers and servers.
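
      The effect of slicing on load balance can be illustrated with a minimal Python sketch. It is not SINGA's implementation: the function names, the fixed slice size of 2500 values, the Param sizes, and the greedy least-loaded assignment are all assumptions chosen for illustration, and the stub thread itself is not modelled.

```python
from typing import Dict, List, Tuple

# A slice is (param_name, offset, length); this layout is hypothetical.
Slice = Tuple[str, int, int]


def slice_params(param_sizes: Dict[str, int], max_slice: int) -> List[Slice]:
    """Split every Param into contiguous slices of at most max_slice values."""
    slices: List[Slice] = []
    for name, size in param_sizes.items():
        for offset in range(0, size, max_slice):
            slices.append((name, offset, min(max_slice, size - offset)))
    return slices


def assign_greedy(slices: List[Slice], num_servers: int) -> List[int]:
    """Give each slice (largest first) to the least-loaded server.

    Returns the number of values each server ends up holding.
    """
    loads = [0] * num_servers
    for _, _, length in sorted(slices, key=lambda s: -s[2]):
        target = loads.index(min(loads))
        loads[target] += length
    return loads


# Assumed sizes: a weight matrix 100 times larger than a bias vector.
params = {"weight": 10000, "bias": 100}

# Without slicing, each Param is one indivisible unit.
unsliced = [(name, 0, size) for name, size in params.items()]
loads_before = assign_greedy(unsliced, num_servers=2)

# With slicing, the large Param is spread over both servers.
sliced = slice_params(params, max_slice=2500)
loads_after = assign_greedy(sliced, num_servers=2)

print(loads_before)  # [10000, 100]
print(loads_after)   # [5100, 5000]
```

      In this example, one server holds 10000 values and the other only 100 when Params are assigned whole; after slicing, the two servers hold 5100 and 5000 values. The same idea applies when the units being balanced are server groups rather than servers within one group.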

          People

            wangwei.cs wangwei