SINGA-131: Implement and optimize hybrid training using both CPU and GPU


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • None

    Description

      We previously discussed implementing hybrid training with researchers from Stanford:
      http://mail-archives.apache.org/mod_mbox/singa-dev/201507.mbox/%3CCAJz0iLsd5iSCqqVU4QHLKzMO2o%2BFt-40kN8RgWkYhDn%3D6Qqqbw%40mail.gmail.com%3E.
      Now that GPU training is supported, we can move on to this feature.

      The distributed training framework is a natural fit for hybrid training with CPU and GPU. The first n workers would be assigned GPU cards (where n is the number of cards configured by the user), and the remaining workers would run on the CPU, as sketched below.
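      For illustration, a minimal C++ sketch of such an assignment policy. The WorkerConfig struct and AssignDevices function are hypothetical names for this example, not actual SINGA APIs:

      ```cpp
      // Hypothetical sketch (not actual SINGA code): give the first n workers
      // one GPU card each and run the remaining workers on the CPU.
      #include <iostream>
      #include <string>
      #include <vector>

      struct WorkerConfig {
        int worker_id;
        bool use_gpu;   // true if this worker owns a GPU card
        int device_id;  // CUDA device ordinal, or -1 for CPU
      };

      std::vector<WorkerConfig> AssignDevices(int num_workers, int num_gpus) {
        std::vector<WorkerConfig> configs;
        for (int id = 0; id < num_workers; ++id) {
          bool on_gpu = id < num_gpus;  // first n workers take the GPUs
          configs.push_back({id, on_gpu, on_gpu ? id : -1});
        }
        return configs;
      }

      int main() {
        // e.g., 6 workers with 2 GPU cards configured by the user
        for (const auto& c : AssignDevices(6, 2)) {
          std::cout << "worker " << c.worker_id << " -> "
                    << (c.use_gpu ? "GPU " + std::to_string(c.device_id)
                                  : std::string("CPU"))
                    << "\n";
        }
        return 0;
      }
      ```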

      Some code may need updates and optimization to handle the memory transfers between GPU workers and CPU workers; most of it is in worker.cc, param.cc, and stub.cc.
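      To illustrate the kind of transfer involved, a minimal sketch using the CUDA runtime API (cudaMalloc/cudaMemcpy/cudaFree are real CUDA calls; the buffer names and sizes are made up for the example, and this is not the SINGA implementation):

      ```cpp
      // Hypothetical sketch of the device-to-host copy a GPU worker needs
      // before its gradients can be aggregated with those of CPU workers.
      #include <cuda_runtime.h>
      #include <cstdio>
      #include <vector>

      int main() {
        const size_t n = 1024;                  // parameter count (illustrative)
        std::vector<float> host_grad(n, 0.0f);  // host buffer the stub can read

        float* dev_grad = nullptr;
        cudaMalloc(&dev_grad, n * sizeof(float));
        cudaMemset(dev_grad, 0, n * sizeof(float));

        // ... a GPU worker would compute gradients into dev_grad here ...

        // Copy the gradients to host memory so the CPU-side stub/server can
        // aggregate them with updates coming from CPU workers.
        cudaMemcpy(host_grad.data(), dev_grad, n * sizeof(float),
                   cudaMemcpyDeviceToHost);

        cudaFree(dev_grad);
        std::printf("copied %zu floats from device to host\n", n);
        return 0;
      }
      ```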

      Automatically tuning the workload between GPU and CPU workers could be designed and implemented in this ticket or in a new one; one possible scheme is sketched below.
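      As one possible design (an assumption for illustration, not something settled in this ticket), the global mini-batch could be split across workers in proportion to their measured throughput, so that fast GPU workers and slower CPU workers finish each iteration at roughly the same time:

      ```cpp
      // Hypothetical sketch of workload tuning: split a global mini-batch
      // across workers in proportion to each worker's measured throughput.
      #include <cstddef>
      #include <iostream>
      #include <numeric>
      #include <vector>

      std::vector<int> SplitBatch(int batch_size,
                                  const std::vector<double>& throughput) {
        double total = std::accumulate(throughput.begin(), throughput.end(), 0.0);
        std::vector<int> shares;
        int assigned = 0;
        for (double t : throughput) {
          int share = static_cast<int>(batch_size * t / total);
          shares.push_back(share);
          assigned += share;
        }
        shares.back() += batch_size - assigned;  // rounding remainder to last worker
        return shares;
      }

      int main() {
        // e.g., two GPU workers (~500 images/s) and two CPU workers (~50 images/s)
        std::vector<double> tput = {500.0, 500.0, 50.0, 50.0};
        for (int s : SplitBatch(256, tput)) std::cout << s << ' ';
        std::cout << '\n';  // prints: 116 116 11 13
        return 0;
      }
      ```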

      Attachments

        Activity


          People

            Assignee: Unassigned
            Reporter: wangwei (wangwei.cs)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Original Estimate: 336h
                Remaining Estimate: 336h
                Time Spent: Not Specified
