Details
Type: Task
Status: Resolved
Priority: Major
Resolution: Fixed
Description
We need GPU scheduling on our YARN clusters to support deep learning workloads. However, our main production clusters run older versions of branch-2 (2.7 in our case). To avoid supporting too many widely divergent Hadoop versions across multiple clusters, we would like to backport the resource types/resource profiles feature to branch-2, along with the GPU-specific support.
We have done a trial backport of YARN-3926, plus some miscellaneous patches in YARN-7069 based on issues we uncovered, and the backport went fairly smoothly. We also did a trial backport of most of YARN-6223 (without Docker support).
Regarding the backports, perhaps we can do the development in a feature branch and then merge to branch-2 when ready.