Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- SystemDS 2.1
Description
As with the compilation step, the runtime part of CUDA codegen needs improvement: it should be specialized so that executing runtime instructions does not require conditionals.
This improvement refactors the SpoofCUDA runtime GPU instruction into a generic part, which prepares input from the cache, and specific SpoofOperator-derived classes, one per codegen template, which handle template-specific extras such as temporary memory and output dimensions.
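The split described above could look roughly like the following Java sketch. All class and method names here are hypothetical and are not the SystemDS API; plain arrays stand in for GPU buffers and for the input taken from the cache. The point is only that the generic part contains no per-template conditionals, because each subclass supplies its own output dimensions and temporary memory size.

```java
// Hypothetical sketch: names are illustrative, not the SystemDS classes.
abstract class SpoofOperatorSketch {
    // Generic part shared by all templates. In the real runtime this is
    // where input would be prepared from the cache.
    final double[] execute(double[][] input) {
        int rows = input.length;
        int cols = input[0].length;
        int[] dims = outputDims(rows, cols);
        double[] tmp = new double[tempSize(rows, cols)];
        double[] out = new double[dims[0] * dims[1]];
        compute(input, tmp, out);
        return out;
    }

    // Template-specific extras, supplied by each derived class.
    abstract int[] outputDims(int rows, int cols);
    abstract int tempSize(int rows, int cols);
    abstract void compute(double[][] in, double[] tmp, double[] out);
}

// Cell-wise template: output has the input's shape, no temporary memory.
class CellwiseSketch extends SpoofOperatorSketch {
    int[] outputDims(int rows, int cols) { return new int[] {rows, cols}; }
    int tempSize(int rows, int cols) { return 0; }
    void compute(double[][] in, double[] tmp, double[] out) {
        int cols = in[0].length;
        for (int i = 0; i < in.length; i++)
            for (int j = 0; j < cols; j++)
                out[i * cols + j] = in[i][j] * in[i][j];
    }
}

// Row-wise template: one output value per row, one row-sized temp buffer.
class RowwiseSketch extends SpoofOperatorSketch {
    int[] outputDims(int rows, int cols) { return new int[] {rows, 1}; }
    int tempSize(int rows, int cols) { return cols; }
    void compute(double[][] in, double[] tmp, double[] out) {
        for (int i = 0; i < in.length; i++) {
            // Stage the squared row in temporary memory, then reduce it.
            for (int j = 0; j < tmp.length; j++)
                tmp[j] = in[i][j] * in[i][j];
            double sum = 0;
            for (double v : tmp)
                sum += v;
            out[i] = sum;
        }
    }
}
```

Adding a further template then means adding one subclass, while `execute` stays unchanged.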
Error handling needs improvement: native exceptions should be caught and the Java version of the operator executed as a fallback.
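A minimal sketch of the fallback idea, assuming that a failure in the native code surfaces on the Java side as a `RuntimeException` or an `UnsatisfiedLinkError`. The helper name `runWithFallback` is hypothetical, and the two suppliers stand in for the native and Java versions of the same operator.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of SystemDS.
final class FallbackSketch {
    private FallbackSketch() {}

    // Try the native (CUDA) path first; if it fails, run the Java operator.
    static <T> T runWithFallback(Supplier<T> nativeOp, Supplier<T> javaOp) {
        try {
            return nativeOp.get();
        } catch (RuntimeException | UnsatisfiedLinkError e) {
            // Assumption: native failures arrive as one of these two types.
            return javaOp.get();
        }
    }
}
```

For example, `FallbackSketch.<Integer>runWithFallback(() -> { throw new IllegalStateException("kernel failed"); }, () -> 7)` returns the Java result `7`. A real implementation would presumably also log the native failure before falling back.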
The single- and double-precision floating-point variants should be instantiated at startup, in the fashion of the CudaSupportFunctions interface.
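One way to read this is that the precision is resolved once, behind an interface, so no per-instruction check is needed afterwards. The sketch below illustrates that pattern only: the interface, its methods, the kernel-name suffixes, and the `"single"`/`"double"` setting values are all assumptions made for illustration and do not reproduce the actual CudaSupportFunctions interface.

```java
// Hypothetical interface: one implementation per floating-point precision.
interface PrecisionOpsSketch {
    int sizeOfDataType();   // bytes per element
    String kernelSuffix();  // illustrative suffix for kernel names
}

final class SinglePrecisionSketch implements PrecisionOpsSketch {
    public int sizeOfDataType() { return Float.BYTES; }
    public String kernelSuffix() { return "_f"; }
}

final class DoublePrecisionSketch implements PrecisionOpsSketch {
    public int sizeOfDataType() { return Double.BYTES; }
    public String kernelSuffix() { return "_d"; }
}

final class PrecisionRegistrySketch {
    private PrecisionRegistrySketch() {}

    // Called once at startup; the returned object is reused by all operators.
    static PrecisionOpsSketch forName(String precision) {
        switch (precision) {
            case "single":
                return new SinglePrecisionSketch();
            case "double":
                return new DoublePrecisionSketch();
            default:
                throw new IllegalArgumentException("unknown precision: " + precision);
        }
    }
}
```

An operator would then ask the selected object for the element size or kernel name instead of branching on the precision at execution time.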