Details
Type: Task
Status: Open
Priority: Major
Resolution: Unresolved
Attachments
1. Improve the performance of copyUpperToLowerTriangleDense kernel | Open | Unassigned
2. Choose the block dimensions of custom kernels using CUDA's occupancy API | Open | Unassigned
3. Add custom kernels for dense binary elementwise operations | Open | Unassigned
4. Add custom kernels for sparse binary elementwise operations | Open | Unassigned
5. Add custom kernel for dense ternary aggregate instruction | Open | Unassigned
6. Add custom kernel for sparse ternary aggregate instruction | Open | Unassigned
7. Add support for aggregate unary operations on GPU | Open | Nakul Jindal
8. Implement custom kernel for append operation | Open | Unassigned
9. Implement custom kernel for reshape operation | Open | Unassigned
10. Add bias_multiply operator | Closed | Niketan Pansare
11. Add custom kernel for sparse matrix dense vector multiplication | Open | Unassigned
12. Add custom kernel for sparse matrix dense matrix multiplication | Open | Unassigned