Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
In our `nn` library, the convolution and pooling layers have to pass around the spatial dimensions (height and width) of the images that are stretched out into rows of the input/output matrices. These output dimensions are computed within the forward functions of those layers as small scalar equations. Mathematically, these sizes can be determined at compile time, and it is nice to have the size equations expressed in DML (vs. hiding them inside the engine within built-in functions). However, we do not currently evaluate these expressions during compilation, so the sizes remain unknown even during recompilation. This leads to worst-case (max) memory estimates, and thus often to unnecessary distributed runtime ops rather than simple CP ones.
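For concreteness, the following is a minimal sketch of the kind of scalar size equations computed inside a conv2d-style forward function; the variable names (Hin, Hf, padh, strideh, etc.) are illustrative assumptions, not quotes from the library:

```dml
# Output spatial dimensions of a convolution: simple scalar arithmetic over
# values that are typically all literals or constants at compile time.
Hout = as.integer(floor((Hin + 2*padh - Hf)/strideh + 1))
Wout = as.integer(floor((Win + 2*padw - Wf)/stridew + 1))
```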
I have two related scenarios for which this is a problem. They both involve the Houtc1 & Woutc1 values that are returned from a `conv2d::forward(...)` function. These represent the spatial dimensions of the volume associated with each row of the function's output outc1, and the third dimension is the number of filters F1. Thus, outc1 has a number of columns equal to F1*Houtc1*Woutc1.
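For example (illustrative numbers only, not taken from the attached scripts), with F1 = 32 filters and a 28x28 output volume (Houtc1 = Woutc1 = 28), outc1 would have 32*28*28 = 25088 columns.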
In the first scenario (scenario1.py), a random matrix doutc1 is created that should have the same dimensions as outc1. For the columns, if I use cols=ncol(outc1) in this rand statement, the size will be propagated, and CP ops will be compiled and run. If I instead use cols=F1*Houtc1*Woutc1, the size will remain unknown, even during recompilation, and thus Spark ops will be compiled and run. I have included the recompile hops plan (scenario1_plan.txt).
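A minimal sketch of scenario 1, assuming the conv2d layer is sourced from the `nn` library and that its forward function takes the usual input/weight/bias, size, stride, and pad arguments; the exact argument list and file path here are assumptions, not a copy of scenario1.py:

```dml
source("nn/layers/conv2d.dml") as conv2d  # path is an assumption

[outc1, Houtc1, Woutc1] = conv2d::forward(X, W1, b1, C, Hin, Win, Hf, Wf,
                                          stride, stride, pad, pad)

# Size propagates: ncol(outc1) is known during recompilation, so CP ops are compiled.
doutc1 = rand(rows=N, cols=ncol(outc1))

# Size never becomes known: F1*Houtc1*Woutc1 is not constant-folded,
# so max memory estimates force Spark ops.
doutc1 = rand(rows=N, cols=F1*Houtc1*Woutc1)
```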
In the second scenario (scenario2.py), a max_pool2d::forward(...) function is inserted after the conv2d::forward(...) function, and it requires the Houtc1 and Woutc1 variables to be supplied as arguments. Since those expressions are not evaluated at compile time, the max-pooling output sizes remain unknown, even during recompilation, and thus Spark ops will be compiled and run. I have included the recompile hops plan (scenario2_plan.txt).
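A corresponding sketch of scenario 2; again, the file paths, pooling window/stride values, and argument order are assumptions for illustration:

```dml
source("nn/layers/conv2d.dml") as conv2d          # paths are assumptions
source("nn/layers/max_pool2d.dml") as max_pool2d

[outc1, Houtc1, Woutc1] = conv2d::forward(X, W1, b1, C, Hin, Win, Hf, Wf,
                                          stride, stride, pad, pad)

# Houtc1/Woutc1 are not evaluated during compilation, so the pooling output
# sizes (and hence ncol(outp1)) stay unknown and Spark ops are compiled.
[outp1, Houtp1, Woutp1] = max_pool2d::forward(outc1, F1, Houtc1, Woutc1,
                                              2, 2, 2, 2, 0, 0)
```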
We should improve or fix our constant folding rewrites so that these scenarios compile to CP ops, as this is necessary for performant deep learning applications. Note too that this issue will be present in other, non-deep-learning scenarios as well.
Mailing list thread: https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01657.html
Attachments
Issue Links
- is related to
  - SYSTEMDS-1566 Possible regression from 0.13 -> 0.14 for MNIST LeNet script (Closed)
  - SYSTEMDS-1466 Update `convnet.dml` to use distributed SGD. (In Progress)
- relates to
  - SYSTEMDS-540 Deep Learning (In Progress)
  - SYSTEMDS-618 Deep Learning DML Library (In Progress)
  - SYSTEMDS-1185 SystemML Breast Cancer Project (Resolved)
  - SYSTEMDS-1575 DataType Change Test Failure (Closed)
  - SYSTEMDS-1554 IPA Scalar Transient Read Replacement (Closed)