Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Description
This task is to add new matrix multiplication kernels for:
C = t(A) %*% B
C = A %*% t(B)
C = t(A) %*% t(B)
t(C) = A %*% B
t(C) = ... etc
The kernels should multiply the matrices without materializing the transposed operands in memory (that is, without allocating a transposed copy).
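To illustrate the idea for the first two variants listed above, here is a minimal sketch on dense, row-major double[] arrays. It is not taken from LibMatrixMult: the class name, method names, and array layout are assumptions made for illustration only, and the loops are neither blocked nor parallelized. The transpose is expressed purely through index arithmetic, so no transposed copy is allocated.

```java
import java.util.Arrays;

/** Hypothetical sketch, not SystemDS code: dense row-major arrays only. */
class TransposeFreeMultSketch {

    /** C = t(A) %*% B, with A stored as k x m and B stored as k x n; C is m x n. */
    static double[] multTransLeft(double[] a, double[] b, int k, int m, int n) {
        double[] c = new double[m * n];
        // Iterate over the shared dimension first, so that A and B are both
        // read row by row even though A is logically transposed.
        for (int l = 0; l < k; l++) {
            for (int i = 0; i < m; i++) {
                double aval = a[l * m + i]; // A[l][i], which is t(A)[i][l]
                for (int j = 0; j < n; j++)
                    c[i * n + j] += aval * b[l * n + j];
            }
        }
        return c;
    }

    /** C = A %*% t(B), with A stored as m x k and B stored as n x k; C is m x n. */
    static double[] multTransRight(double[] a, double[] b, int m, int k, int n) {
        double[] c = new double[m * n];
        // Each output cell is the dot product of row i of A and row j of B.
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0;
                for (int l = 0; l < k; l++)
                    sum += a[i * k + l] * b[j * k + l];
                c[i * n + j] = sum;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4, 5, 6}; // interpreted as 3 x 2
        double[] b = {1, 0, 0, 1, 1, 1}; // interpreted as 3 x 2
        System.out.println(Arrays.toString(multTransLeft(a, b, 3, 2, 2)));  // 2 x 2 result
        System.out.println(Arrays.toString(multTransRight(a, b, 3, 2, 3))); // 3 x 3 result
    }
}
```

Both loops read their inputs along rows, which is the main reason a direct kernel can beat transposing first. An actual kernel would additionally need to handle sparse blocks and multi-threading.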
The implementations should be added to the file:
src/main/java/org/apache/sysds/runtime/matrix/data/LibMatrixMult.java
Full integration is not required (full integration would mean that the kernels are used from DML).
However, any implementation should be faster than multiplication with transpose allocation. If it is not, it should fall back to a version where the input is transposed first. The same fallback can also be used for variants that are not implemented yet.
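The fallback requirement could be sketched as a small dispatcher that either calls a direct kernel or transposes the input explicitly and reuses the plain multiplication. Everything below is hypothetical: the class, the method names, and the boolean flag are illustrative assumptions and not part of the SystemDS API. How to decide whether the direct kernel is actually faster is left open here.

```java
import java.util.Arrays;

/** Hypothetical sketch, not SystemDS code: dense row-major arrays only. */
class TransposeFallbackSketch {

    /** Allocates t(A) for an A stored as rows x cols. */
    static double[] transpose(double[] a, int rows, int cols) {
        double[] t = new double[a.length];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                t[j * rows + i] = a[i * cols + j];
        return t;
    }

    /** Plain C = A %*% B, with A stored as m x k and B stored as k x n. */
    static double[] mult(double[] a, double[] b, int m, int k, int n) {
        double[] c = new double[m * n];
        for (int i = 0; i < m; i++)
            for (int l = 0; l < k; l++) {
                double aval = a[i * k + l];
                for (int j = 0; j < n; j++)
                    c[i * n + j] += aval * b[l * n + j];
            }
        return c;
    }

    /** Direct C = t(A) %*% B without allocating t(A); A is k x m, B is k x n. */
    static double[] multTransLeftDirect(double[] a, double[] b, int k, int m, int n) {
        double[] c = new double[m * n];
        for (int l = 0; l < k; l++)
            for (int i = 0; i < m; i++) {
                double aval = a[l * m + i];
                for (int j = 0; j < n; j++)
                    c[i * n + j] += aval * b[l * n + j];
            }
        return c;
    }

    /**
     * C = t(A) %*% B. The flag useDirect is an assumed stand-in for whatever
     * decides that the direct kernel exists and is expected to be faster.
     */
    static double[] multTransLeft(double[] a, double[] b, int k, int m, int n,
            boolean useDirect) {
        if (useDirect)
            return multTransLeftDirect(a, b, k, m, n);
        // Fallback: allocate the transpose, then use the plain multiplication.
        return mult(transpose(a, k, m), b, m, k, n);
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4, 5, 6}; // interpreted as 3 x 2
        double[] b = {1, 0, 0, 1, 1, 1}; // interpreted as 3 x 2
        double[] direct = multTransLeft(a, b, 3, 2, 2, true);
        double[] fallback = multTransLeft(a, b, 3, 2, 2, false);
        System.out.println(Arrays.equals(direct, fallback));
    }
}
```

Variants without a direct kernel yet, such as t(A) %*% t(B), could go through the same fallback path by transposing both inputs before calling the plain multiplication.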
Tests for the methods should be put in component tests:
src/test/java/org/apache/sysds/test/component/matrixmult