Details
Type: Improvement
Status: To Do
Priority: Major
Resolution: Unresolved
Description
This improvement is based on the ideas proposed in this SysML 2019 paper: https://anandj.in/wp-content/uploads/sysml.pdf
The key idea is, in a data-parallel training scenario, to synchronize each layer's parameters as finer-grained packets scheduled by priority, where priority is defined by the layer index (layers closer to the input get higher priority, since they are needed first in the next forward pass). Scheduling parameter synchronization this way utilizes the network better and thus improves training throughput. A small scheduling sketch follows below.
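To make the scheduling idea concrete, here is a minimal framework-agnostic Python sketch of slicing each layer's gradient into fixed-size pieces and draining them in priority order. The names (slice_gradient, priority_synchronize, slice_size) are illustrative only and not part of any existing API; a real implementation would overlap this scheduling with backpropagation and the actual network transport (parameter server push or allreduce) instead of collecting everything first.
{code:python}
import heapq

import numpy as np


def slice_gradient(layer_index, grad, slice_size):
    """Split one layer's flattened gradient into fixed-size slices.

    Each slice carries the layer's index as its priority. A lower
    layer index means higher priority, because early layers are
    needed first in the next forward pass.
    """
    flat = grad.ravel()
    return [
        (layer_index, offset, flat[offset:offset + slice_size])
        for offset in range(0, flat.size, slice_size)
    ]


def priority_synchronize(layer_grads, slice_size=1024):
    """Yield gradient slices in priority order (a scheduling sketch).

    layer_grads: list of (layer_index, gradient ndarray) pairs in the
    order backpropagation produces them (last layer first). Slices from
    different layers are interleaved so low-index layers go out first.
    """
    heap = []
    seq = 0  # unique tie-breaker so heapq never compares ndarrays
    for layer_index, grad in layer_grads:
        for item in slice_gradient(layer_index, grad, slice_size):
            heapq.heappush(heap, (item[0], seq, item))
            seq += 1
    while heap:
        _, _, (layer_index, offset, data) = heapq.heappop(heap)
        yield layer_index, offset, data


if __name__ == "__main__":
    # Gradients arrive in reverse layer order during backprop.
    grads = [(2, np.ones(3000)), (1, np.ones(2000)), (0, np.ones(2500))]
    for layer_index, offset, data in priority_synchronize(grads):
        print("send layer=%d offset=%d len=%d" % (layer_index, offset, data.size))
{code}
The point of the small slice size is that a high-priority slice never has to wait behind a large, low-priority tensor that is already being transmitted, which is what gives the better network utilization described above.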