Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
SystemML 0.9
None
Description
The parfor util script sample.dml fails with a dimension mismatch in special cases where the remote memory budget of map/reduce tasks is larger than the driver memory budget and the permutation matrix multiplication would be compiled to MR in local parfor but to CP in remote parfor execution.
In these cases, we trigger a forced recompile to CP, which internally tries to reduce the overhead by recompiling only those DAGs whose runtime plan contains MR instructions. This selective recompilation is invalid for permutation matrix multiplications, which span two subsequent DAGs where the first DAG does not necessarily contain MR instructions.
Since the overhead of recompiling average DAGs (50-100 operators) is by now less than 1ms, we should always recompile the entire parfor body program in these cases.
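The difference between the selective and the full recompile can be illustrated with a small toy model in Python. This is only a sketch and not SystemML code: the `Dag` class, its fields, and both recompile functions are hypothetical names invented for illustration. The assumption is that the first DAG of the pair was compiled expecting the multiplication in the second DAG to run in MR, although its own plan holds no MR instructions.

```python
from dataclasses import dataclass


@dataclass
class Dag:
    # Hypothetical stand-in for one DAG of the parfor body program.
    name: str
    has_mr: bool        # current runtime plan contains MR instructions
    assumes_exec: str   # exec type the plan was generated for: "MR" or "CP"


def recompile_selective(dags):
    """Selective behavior: only touch DAGs whose plan contains MR instructions."""
    for d in dags:
        if d.has_mr:
            d.assumes_exec, d.has_mr = "CP", False


def recompile_all(dags):
    """Proposed behavior: recompile every DAG of the parfor body."""
    for d in dags:
        d.assumes_exec, d.has_mr = "CP", False


def consistent(dags):
    """A multi-DAG rewrite is only valid if all its DAGs assume the same exec type."""
    return len({d.assumes_exec for d in dags}) == 1


def make_body():
    # The permutation matrix multiplication spans two subsequent DAGs;
    # only the second one actually contains MR instructions.
    return [
        Dag("prepare", has_mr=False, assumes_exec="MR"),
        Dag("multiply", has_mr=True, assumes_exec="MR"),
    ]


selective = make_body()
recompile_selective(selective)   # skips "prepare", which stays in its MR form

full = make_body()
recompile_all(full)              # both DAGs are regenerated for CP
```

In this model the selective variant leaves the two DAGs in an inconsistent state, whereas recompiling everything does not; given the sub-millisecond recompilation cost mentioned above, the full recompile is the simpler and safer choice.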
As a related note: Since we now support removeEmpty with selection vectors, we should rewrite these permutation matrix multiplications to removeEmpty with selection, which is equivalent from a runtime perspective but would simplify debugging compared to the current multi-DAG rewrite.
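The equivalence behind this suggestion can be checked on a small example. The following is a plain Python sketch, not DML: `selection_matrix`, `matmul`, and `remove_empty_rows` are hypothetical helpers that only mimic the semantics of multiplying with a selection (permutation) matrix and of removing rows by a 0/1 selection vector.

```python
def matmul(A, B):
    """Naive dense matrix multiplication on lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def selection_matrix(select):
    """Rows of diag(select) with the all-zero rows dropped."""
    n = len(select)
    return [[1 if j == i else 0 for j in range(n)]
            for i, s in enumerate(select) if s]


def remove_empty_rows(X, select):
    """Keep exactly the rows of X whose selection flag is non-zero."""
    return [row for row, s in zip(X, select) if s]


X = [[1, 2], [3, 4], [5, 6], [7, 8]]
select = [1, 0, 1, 1]

via_matmul = matmul(selection_matrix(select), X)
via_select = remove_empty_rows(X, select)
```

Both paths yield the rows of `X` at positions 0, 2, and 3. The selection-based form needs no intermediate matrix and is a single operation, which fits the debugging argument made above.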