Description
This project aims to make it easier for PySpark users to distribute PyTorch code. The corresponding Design Document gives more context. The project was defined by the Databricks ML Training Team; please reach out to gurwls223 (Spark-side) or erithwik for details.
| # | Sub-task | Status | Assignee |
| 1 | Implement Baseline API Code | Resolved | Unassigned |
| 2 | Implement functionality for training a PyTorch file locally | Resolved | Rithwik Ediga Lakhamsani |
| 3 | Implement functionality for training a PyTorch file on the executors | Resolved | Rithwik Ediga Lakhamsani |
| 4 | Implement logging from the executor nodes | Resolved | Rithwik Ediga Lakhamsani |
| 5 | Implement training functions as input | Resolved | Rithwik Ediga Lakhamsani |
| 6 | Implement support for PyTorch Lightning | Resolved | Unassigned |
| 7 | Add Integration Tests | Resolved | Rithwik Ediga Lakhamsani |
| 8 | Change API so that the user doesn't have to explicitly set pytorch-lightning | Resolved | Unassigned |
| 9 | Address General Fixes | Open | Unassigned |
| 10 | Add Instrumentation | Resolved | Unassigned |