Description
Currently SINGA can run over a cluster of CPU nodes and over a single node with multiple GPUs.
This ticket extends SINGA to run over a GPU cluster.
The framework itself is applicable to such a training environment.
We need to update the code for allocating GPU workers on different nodes and for message passing between GPUs on different nodes (refer to SINGA-133).
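
As a rough illustration of the allocation step only, the sketch below maps worker IDs to (host, local GPU) pairs and checks whether two workers would need cross-node messaging. It is not SINGA code: the function names `allocate_gpu_workers` and `is_remote`, the host names, and the fill-one-host-first policy are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical helper, not part of SINGA: assigns each worker a GPU slot.
def allocate_gpu_workers(
    hosts: List[str], gpus_per_host: int, num_workers: int
) -> Dict[int, Tuple[str, int]]:
    """Map each worker ID to a (host, local GPU index) pair.

    Assumed policy: fill all GPUs of one host before moving to the next,
    so consecutive worker IDs share a node where possible.
    """
    if gpus_per_host <= 0:
        raise ValueError("gpus_per_host must be positive")
    capacity = len(hosts) * gpus_per_host
    if num_workers > capacity:
        raise ValueError(
            f"{num_workers} workers requested but only {capacity} GPUs available"
        )
    placement: Dict[int, Tuple[str, int]] = {}
    for worker_id in range(num_workers):
        host = hosts[worker_id // gpus_per_host]
        gpu = worker_id % gpus_per_host
        placement[worker_id] = (host, gpu)
    return placement


# Hypothetical helper: True when a message between two workers must cross nodes.
def is_remote(placement: Dict[int, Tuple[str, int]], src: int, dst: int) -> bool:
    return placement[src][0] != placement[dst][0]


# Example with made-up host names: 2 nodes, 2 GPUs each, 4 workers.
placement = allocate_gpu_workers(["node0", "node1"], gpus_per_host=2, num_workers=4)
```

With this placement, workers 0 and 1 share node0 and can exchange data locally, while workers 1 and 2 sit on different nodes and would go through the inter-node messaging path that this ticket needs to update.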