SINGA-57: Improve Distributed Hogwild


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      The implementation of distributed Hogwild from SINGA-8 uses the stub thread to monitor network bandwidth. When bandwidth is available, the stub sends a sync reminder message to a server, which triggers that server to sync one Param slice with the other server groups.

      The code is messy because it mixes network bandwidth monitoring with the handling of sync reminder messages. Another problem is that the frequency of reminder messages is unpredictable, because a reminder is generated only when the router times out. If the workers and servers run so fast that the router rarely times out, reminder messages are rarely sent. Conversely, if the router times out frequently, many reminder messages are generated.

      This ticket improves the implementation by fixing the frequency of synchronization between server groups. For each Param (slice), a server sends a sync message to the server group that masters/maintains the Param once every sync_freq updates.
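
The fixed-frequency rule above can be illustrated with a per-slice update counter. This is only a minimal sketch under stated assumptions, not the SINGA code: the class `SliceSyncCounter`, the method `on_update`, and the slice IDs are hypothetical names chosen for illustration. Only `sync_freq` comes from the ticket.

```python
from collections import defaultdict


class SliceSyncCounter:
    """Hypothetical helper: flags every sync_freq-th update of a Param slice."""

    def __init__(self, sync_freq):
        if sync_freq < 1:
            raise ValueError("sync_freq must be >= 1")
        self.sync_freq = sync_freq
        # slice_id -> number of updates since the last sync message
        self._updates = defaultdict(int)

    def on_update(self, slice_id):
        """Record one update; return True when a sync message is due."""
        self._updates[slice_id] += 1
        if self._updates[slice_id] >= self.sync_freq:
            # Reset so the next sync is due after another sync_freq updates.
            self._updates[slice_id] = 0
            return True
        return False


counter = SliceSyncCounter(sync_freq=3)
due = [counter.on_update("slice-0") for _ in range(7)]
print(due)  # [False, False, True, False, False, True, False]
```

In this sketch the sync frequency depends only on the number of updates applied to a slice, so it no longer depends on how often the router times out.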


          People

            Assignee: Unassigned
            Reporter: wangwei
            Votes: 0
            Watchers: 2
