SystemDS / SYSTEMDS-1561

Improve constant folding during compilation

Details

    • Improvement
    • Status: Closed
    • Major
    • Resolution: Fixed
    • SystemML 0.15

    Description

      In our `nn` library, the convolution and pooling layers have to pass around the spatial dimensions (height and width) of the images, which are stretched out into rows of the input/output matrices. These output dimensions are computed within the forward functions of those layers as small scalar equations. Mathematically, these sizes can be determined at compile time, and it is nice to have the size equations in DML (vs. hiding them inside the engine within built-in functions). However, we do not currently evaluate these expressions during compilation, so the sizes remain unknown even during recompilation. This leads to worst-case (maximum) memory estimates, which in turn often cause unnecessary distributed runtime ops to be compiled rather than simple CP ones.

      I have two related scenarios for which this is a problem. Both involve the Houtc1 and Woutc1 values that are returned from a `conv2d::forward(...)` function. These are the spatial dimensions (height and width) of the volume stored in each row of the function's output outc1; the third dimension is F1. Thus, outc1 has F1*Houtc1*Woutc1 columns.
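To make the "small scalar equations" concrete, here is a minimal Python sketch of the standard convolution output-size formula. It is an illustration only, not the `nn` library's DML code: the function name `conv2d_output_dims` and all numeric values (28x28 input, 3x3 filter, stride 1, padding 1, F1 = 32) are assumptions chosen for the example.

```python
def conv2d_output_dims(Hin, Win, Hf, Wf, stride_h, stride_w, pad_h, pad_w):
    """Return (Hout, Wout) using the standard convolution size formula."""
    Hout = (Hin + 2 * pad_h - Hf) // stride_h + 1
    Wout = (Win + 2 * pad_w - Wf) // stride_w + 1
    return Hout, Wout


# Hypothetical layer configuration; every input below is a literal,
# so the result is fully determined before any data is touched.
F1 = 32
Houtc1, Woutc1 = conv2d_output_dims(
    Hin=28, Win=28, Hf=3, Wf=3, stride_h=1, stride_w=1, pad_h=1, pad_w=1
)

# Number of columns of outc1, as described above.
ncol_outc1 = F1 * Houtc1 * Woutc1
```

Because every operand is a scalar literal, nothing here depends on runtime data, which is why these sizes are in principle available at compile time.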

      In the first scenario (scenario1.py), a random matrix doutc1 is created that should have the same dimensions as outc1. For the columns, if I use cols=ncol(outc1) in this rand statement, the size will be propagated and CP ops will be compiled and run. If I instead use cols=F1*Houtc1*Woutc1, the size will remain unknown, even during recompilation, and thus Spark ops will be compiled and run. I have included the recompile hops plan (scenario1_plan.txt).
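The behavior being asked for can be sketched as constant folding combined with constant propagation: if every variable in a scalar expression is bound to a known literal, the whole expression can be replaced by its value. The following Python sketch uses the standard `ast` module to show the idea. It is not SystemML's compiler code; the helper names `fold` and `fold_expr` and the variable bindings are hypothetical.

```python
import ast
import operator

# Integer operators this sketch knows how to fold.
_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}


def fold(node, env):
    """Return the integer value of `node` if fully known, else None."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.Name):
        # Constant propagation: look up a known scalar binding.
        return env.get(node.id)
    if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
        left = fold(node.left, env)
        right = fold(node.right, env)
        if left is None or right is None:
            return None  # some operand is unknown, so the result is too
        if isinstance(node.op, ast.FloorDiv) and right == 0:
            return None  # leave division by zero for runtime
        return _BINOPS[type(node.op)](left, right)
    return None  # unsupported construct: treat as unknown


def fold_expr(src, env):
    """Parse a scalar expression and try to fold it to a literal."""
    return fold(ast.parse(src, mode="eval").body, env)


# Hypothetical known scalars at (re)compile time.
known = {"F1": 32, "Houtc1": 28, "Woutc1": 28}
cols = fold_expr("F1 * Houtc1 * Woutc1", known)

# With Houtc1 and Woutc1 unbound, the size stays unknown.
unknown_cols = fold_expr("F1 * Houtc1 * Woutc1", {"F1": 32})
```

In the first call every operand is known, so the column count folds to a literal; in the second call the expression stays unknown, which corresponds to the situation described in this scenario.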

      In the second scenario (scenario2.py), a max_pool2d::forward(...) function is inserted after the conv2d::forward(...) function and requires the Houtc1 and Woutc1 variables to be supplied as arguments. Since those variables are not evaluated at compile time, the max pooling sizes remain unknown, even during recompilation, and thus Spark ops will be compiled and run. I have included the recompile hops plan (scenario2_plan.txt).

      We should improve our constant folding rewrites so that these scenarios are handled, as this is necessary for performant deep learning applications. Note that this issue will be present in non-deep-learning scenarios as well.

      Mailing list thread: https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01657.html

      Attachments

        1. scenario2.py (2 kB, Mike Dusenberry)
        2. scenario2_plan.txt (5 kB, Mike Dusenberry)
        3. scenario1.py (2 kB, Mike Dusenberry)
        4. scenario1_plan.txt (5 kB, Mike Dusenberry)

            People

              dusenberrymw Mike Dusenberry