Details
- New Feature
- Status: Resolved
- Major
- Resolution: Fixed
- SystemDS 2.1
Description
Cleaned-up and improved pull request for the initial CUDA codegen. It includes code cleanup and groups the changes into a few commits (the original PR had 60+ commits and was hard to review).
In terms of functionality, this PR will include:
- CUDA code reorganization
- JNI parts to launch generated CUDA kernels
- SPOOF compiler extensions
- CPlan templates
- Runtime instruction for dense input
- Code generation for the SpoofCellwise template
Issue Links
- is blocked by: SYSTEMDS-2692 move cuda codebase (Resolved)