Details
Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Description
For multi-process training, each process should know its own logical process id as well as the physical host:port of every logical process.
This feature should be supported by the ClusterRuntime component.
Before a process starts its job, it first registers its physical host:port with ClusterRuntime and receives a unique process id in return.
When the process needs to communicate with other processes, it looks up their addresses through ClusterRuntime as well.
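
The register-then-lookup flow described above could look roughly like the following. This is only an illustrative sketch in Python, not the actual ClusterRuntime API: the method names `register`, `get_address`, and `all_addresses`, the in-memory storage, and the choice to hand out ids sequentially starting at 0 are all assumptions made for the example. A real implementation would presumably sit behind a shared coordination service rather than live inside one process.

```python
import threading
from typing import Dict, Tuple

Address = Tuple[str, int]  # (host, port)


class ClusterRuntime:
    """Hypothetical in-memory stand-in for the ClusterRuntime component."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._addresses: Dict[int, Address] = {}  # process id -> address
        self._ids: Dict[Address, int] = {}        # address -> process id

    def register(self, host: str, port: int) -> int:
        """Record a process's physical address and return its unique id."""
        address = (host, port)
        with self._lock:
            # Assumption: registering the same address twice returns the same id.
            if address in self._ids:
                return self._ids[address]
            process_id = len(self._addresses)  # assumed sequential ids: 0, 1, 2, ...
            self._addresses[process_id] = address
            self._ids[address] = process_id
            return process_id

    def get_address(self, process_id: int) -> Address:
        """Return the physical address of one logical process."""
        with self._lock:
            return self._addresses[process_id]

    def all_addresses(self) -> Dict[int, Address]:
        """Return a snapshot of every registered process id and its address."""
        with self._lock:
            return dict(self._addresses)


# Each process registers before starting its job, then looks up its peers.
runtime = ClusterRuntime()
worker_a = runtime.register("10.0.0.1", 5000)
worker_b = runtime.register("10.0.0.2", 5000)
peer = runtime.get_address(worker_b)
```

The lock only protects concurrent registrations within a single process. Whether ids should be reused after a process exits, or what happens when a lookup races with a late registration, is left open here because the description does not say.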