Details
Type: Task
Status: Closed
Priority: Major
Resolution: Done
Description
The fused mmchain matrix multiply for patterns such as t(X) %*% (w * (X %*% v)) uses row-wise dotProduct and vectMultAdd operations, which works very well for the common case of tall & skinny matrices where individual rows fit into the L1 cache. However, for graph and text scenarios with wide matrices, this leads to cache thrashing on the input and output vectors.
This task aims to generalize these dense and sparse operations to perform the computation in a cache-conscious manner when necessary, by accessing fragments of the input and output vectors for groups of rows. For dense inputs this is straightforward to realize, while for sparse inputs it requires a careful determination of the block sizes according to the input sparsity.
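A minimal sketch of the dense case (class and method names are illustrative, not the actual SystemML code): the row-wise baseline streams the full-length vectors once per row, while the cache-conscious variant walks column fragments so the touched pieces of v and the output vector stay resident in L1 across a group of rows. Note that blocking the column dimension requires completing the X %*% v pass before the t(X)-side accumulation, since each output fragment depends on all row dot products.

```java
public class MMChainSketch {

    // Row-wise baseline: one dotProduct and one vectMultAdd per row.
    // Fine for tall & skinny X; for wide X each row re-streams the
    // full-length vectors v and out, evicting them from L1 repeatedly.
    static double[] mmchainRowWise(double[][] X, double[] v, double[] w) {
        int m = X.length, n = X[0].length;
        double[] out = new double[n];
        for (int i = 0; i < m; i++) {
            double dot = 0;
            for (int j = 0; j < n; j++)
                dot += X[i][j] * v[j];     // q_i = x_i' v
            double s = dot * w[i];         // scale by w_i
            for (int j = 0; j < n; j++)
                out[j] += s * X[i][j];     // out += s * x_i
        }
        return out;
    }

    // Cache-conscious variant: block the column dimension (block size bn)
    // so each accessed fragment of v and out fits into L1, reusing it
    // across the group of rows before moving to the next fragment.
    static double[] mmchainBlocked(double[][] X, double[] v, double[] w, int bn) {
        int m = X.length, n = X[0].length;
        double[] q = new double[m];
        // pass 1: q = X %*% v, accumulated fragment-wise over columns
        for (int bj = 0; bj < n; bj += bn) {
            int bje = Math.min(bj + bn, n);
            for (int i = 0; i < m; i++) {
                double dot = 0;
                for (int j = bj; j < bje; j++)
                    dot += X[i][j] * v[j];
                q[i] += dot;
            }
        }
        // pass 2: out = t(X) %*% (w * q), again fragment-wise over columns
        double[] out = new double[n];
        for (int bj = 0; bj < n; bj += bn) {
            int bje = Math.min(bj + bn, n);
            for (int i = 0; i < m; i++) {
                double s = w[i] * q[i];
                for (int j = bj; j < bje; j++)
                    out[j] += s * X[i][j];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Small correctness check: both variants must agree up to
        // floating-point reordering.
        int m = 7, n = 100;
        double[][] X = new double[m][n];
        double[] v = new double[n], w = new double[m];
        java.util.Random r = new java.util.Random(42);
        for (int i = 0; i < m; i++) {
            w[i] = r.nextDouble();
            for (int j = 0; j < n; j++)
                X[i][j] = r.nextDouble();
        }
        for (int j = 0; j < n; j++)
            v[j] = r.nextDouble();
        double[] a = mmchainRowWise(X, v, w);
        double[] b = mmchainBlocked(X, v, w, 16);
        double maxDiff = 0;
        for (int j = 0; j < n; j++)
            maxDiff = Math.max(maxDiff, Math.abs(a[j] - b[j]));
        if (maxDiff > 1e-9)
            throw new AssertionError("mismatch: " + maxDiff);
        System.out.println("OK");
    }
}
```

For sparse inputs the same two-pass structure applies, but the column block size would have to be derived from the expected number of nonzeros per fragment rather than a fixed width.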
Issue Links
- relates to SYSTEMDS-913: Performance matrix-vector multiplication w/ tall rhs vector (Closed)