Details
Type: Sub-task
Status: Closed
Priority: Major
Resolution: Implemented
Description
Extend the prefetch instruction to support fetching intermediates from GPU memory to main (host) memory. Place the prefetch operator so that the device-to-host transfer overlaps with other work, maximizing host-device parallelism.
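
One possible placement rule, sketched below in Python, is to issue the prefetch immediately after the GPU instruction that produces an intermediate, whenever a later host instruction reads that intermediate. The transfer can then overlap with everything scheduled between producer and consumer. This is only an illustrative sketch, not the implementation from this ticket: the `Instr` class, the `place_prefetch` function, the device labels, and the example program are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    """Hypothetical instruction record used only for this sketch."""
    name: str
    device: str                      # "GPU" or "CPU" (assumed labels)
    inputs: list = field(default_factory=list)
    output: str = ""


def place_prefetch(program):
    """Insert a device-to-host prefetch right after each GPU producer
    whose output is read by some CPU instruction in the program."""
    gpu_outputs = {i.output for i in program if i.device == "GPU" and i.output}
    # Intermediates that live on the GPU but are consumed on the host.
    needed_on_host = {
        v for i in program if i.device == "CPU"
        for v in i.inputs if v in gpu_outputs
    }
    result = []
    for instr in program:
        result.append(instr)
        if instr.device == "GPU" and instr.output in needed_on_host:
            # Earliest legal position: directly after the producer.
            result.append(Instr("prefetch", "GPU->CPU", [instr.output], instr.output))
    return result


# Assumed example: T1 is produced on the GPU and later read on the CPU.
program = [
    Instr("matmult", "GPU", ["X", "W"], "T1"),
    Instr("relu", "GPU", ["T1"], "T2"),
    Instr("rand", "CPU", [], "R"),
    Instr("plus", "CPU", ["T1", "R"], "out"),
]
placed = place_prefetch(program)
print([i.name for i in placed])
```

In this example the prefetch of `T1` lands directly after `matmult`, so the copy can proceed while `relu` runs on the GPU and `rand` runs on the host. `T2` gets no prefetch because no host instruction reads it.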