Details

    • Epic Name:
      Code generation for operator fusion

      Description

      This epic introduces code generation for automatic operator fusion. Fused operators reduce the number of materialized intermediates and input scans, exploit sparsity, and reduce computation. Generating them automatically also takes far less development effort than hand-coding fused operators.
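      As a rough illustration of the fusion benefits named above, the sketch below evaluates sum(X * Y * Z) twice: once operator by operator, and once as a single fused pass. It is not SystemML or SPOOF code. Plain Python lists stand in for matrices, the function names are hypothetical, and the zero-skipping shortcut assumes all inputs are finite.

```python
# Illustrative only: plain lists stand in for matrices, and the function
# names are hypothetical, not part of SystemML or SPOOF.

def sum_product_unfused(x, y, z):
    """Evaluate sum(x * y * z) one operator at a time."""
    xy = [a * b for a, b in zip(x, y)]    # first materialized intermediate
    xyz = [a * b for a, b in zip(xy, z)]  # second materialized intermediate
    return sum(xyz)                       # separate aggregation pass


def sum_product_fused(x, y, z):
    """Evaluate sum(x * y * z) in one pass without intermediates."""
    total = 0.0
    for a, b, c in zip(x, y, z):
        if a == 0:
            # Sparsity exploitation: a zero in x contributes nothing,
            # assuming b and c are finite (no NaN or infinity).
            continue
        total += a * b * c
    return total


x = [1.0, 0.0, 2.0]
y = [3.0, 4.0, 5.0]
z = [2.0, 2.0, 2.0]
unfused_result = sum_product_unfused(x, y, z)
fused_result = sum_product_fused(x, y, z)
```

      Both variants return the same value. The fused one allocates no temporary lists, performs a single combined pass over the inputs, and skips the multiplications for the zero entry of x.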

      For the 1.0 release, we will introduce code generation as an experimental feature, covering an extended version of SPOOF's code generator as described in the following paper:

      T. Elgamal, S. Luo, M. Boehm, A. V. Evfimievski, S. Tatikonda, B. Reinwald, P. Sen: SPOOF: Sum-Product Optimization and Operator Fusion for Large-Scale Machine Learning, CIDR, 2017
      http://cidrdb.org/cidr2017/papers/p3-elgamal-cidr17.pdf

        Issue Links

        1. Basic code generator Sub-task Closed Matthias Boehm
        2. Compiler integration codegen Sub-task Closed Matthias Boehm
        3. Runtime integration codegen Sub-task Closed Matthias Boehm
        4. Additional meta operator template: MultiAggregate Sub-task Closed Matthias Boehm
        5. Support compressed matrix blocks Sub-task Closed Matthias Boehm
        6. Hardening transfer of generated operators Sub-task Closed Matthias Boehm
        7. Size bounding plan cache Sub-task Closed Matthias Boehm
        8. Support spark codegen instructions w/ multiple RDD inputs Sub-task Closed Matthias Boehm
        9. Extended compiler: materialization decisions Sub-task Closed Matthias Boehm
        10. Disable counter-productive existing fused operators Sub-task Open Unassigned
        11. In-memory source code compilation Sub-task Closed Matthias Boehm
        12. Extended stats tool (code generation statistics) Sub-task Closed Matthias Boehm
        13. Extended explain tool (for generated java/byte code) Sub-task Closed Matthias Boehm
        14. Perftest benchmark codegen Sub-task Open Unassigned
        15. Removal unnecessary int-double casts Sub-task Closed Matthias Boehm
        16. Enable spoof instructions in parfor remote spark Sub-task Closed Matthias Boehm
        17. Codegen for existing cellwise fused operators Sub-task Closed Matthias Boehm
        18. Memory management temporary vector intermediates Sub-task Closed Matthias Boehm
        19. Rework codegen algorithm testcases (use current algorithms) Sub-task Closed Matthias Boehm
        20. Hardening sparse-safe check cellwise operations Sub-task Closed Matthias Boehm
        21. Generalization cellwise template (from sideways vectors to matrices) Sub-task Closed Matthias Boehm
        22. Generalization cellwise template (add mm as aggregation root) Sub-task Closed Matthias Boehm
        23. Simplify cplan construction algorithm Sub-task Closed Matthias Boehm
        24. Support right indexing in cellwise and rowaggregate templates Sub-task Closed Matthias Boehm
        25. Support for different plan selection policies Sub-task Closed Matthias Boehm
        26. Sparsity-exploiting cellwise template Sub-task Closed Matthias Boehm
        27. Cost model for candidate selection Sub-task Closed Matthias Boehm
        28. Support min/max/sumsq in cell templates w/ aggregation Sub-task Closed Matthias Boehm
        29. Fuse row aggregate w/ colvector output into cell template Sub-task Closed Matthias Boehm
        30. Vector primitives for row comparisons Sub-task Closed Matthias Boehm
        31. Handling of plan selection constraints (e.g., memory/blocksize) Sub-task Closed Matthias Boehm
        32. Multi-threaded compilation of fused operators Sub-task Closed Matthias Boehm
        33. Support 'replace' in row aggregate and cell templates Sub-task Closed Matthias Boehm
        34. Support rowMins/rowMaxs in row aggregate template Sub-task Closed Matthias Boehm
        35. Avoid dense allocation of empty sideway inputs Sub-task Closed Matthias Boehm
        36. Support cross-partition multi-aggregates with partial shared reads Sub-task Closed Matthias Boehm
        37. Additional meta operator template: SpoofRowwise Sub-task Closed Matthias Boehm
        38. Support log and exp row vector operations Sub-task Closed Matthias Boehm
        39. Extended compiler: leave hop dags unchanged during initial compilation Sub-task Closed Matthias Boehm
        40. Performance spark rowwise codegen instructions Sub-task Resolved Matthias Boehm
        41. Create additional common unary and binary row vector operations Sub-task Closed Matthias Boehm
        42. Generalize cell template for sideways row vectors inputs Sub-task Closed Matthias Boehm
        43. Extend cost model for distributed operations and broadcasts Sub-task Open Unassigned
        44. Multi-aggregates w/ dot products as aggregation roots Sub-task Closed Matthias Boehm
        45. Support codegen for matrix-matrix multiplications Sub-task Closed Matthias Boehm
        46. Include multi-aggregate plans into cost estimation Sub-task Closed Matthias Boehm
        47. Improve efficiency sparse-unsafe cellwise operations Sub-task Closed Matthias Boehm
        48. Improve handling of sparse outputs and sideway inputs Sub-task Closed Matthias Boehm
        49. Codegen support for various patterns in nn dml library Sub-task Closed Matthias Boehm
        50. Improvements to row templates to capture missed opportunities Sub-task Closed Matthias Boehm
        51. Worst-case size estimates for codegen fused operators Sub-task Closed Matthias Boehm
        52. Generalized row-wise template (scalar-vector, vector indexing) Sub-task Closed Matthias Boehm
        53. Determine minimal number of vector intermediates Sub-task Closed Matthias Boehm
        54. Common subexpression elimination for codegen plans Sub-task Closed Matthias Boehm
        55. Full aggregation support in rowwise templates Sub-task Closed Matthias Boehm
        56. Reduce garbage collection overhead of the codegen compiler Sub-task Closed Matthias Boehm
        57. Support for sparse-unsafe sparse vector primitives Sub-task Closed Matthias Boehm
        58. Rework codegen cost-based plan selector (opt V2) Sub-task Closed Matthias Boehm
        59. Support rowwise cumsum operations Sub-task Closed Matthias Boehm
        60. Simplify algorithm dml scripts Sub-task Open Unassigned
        61. Column-range indexing in rowwise templates Sub-task Closed Matthias Boehm
        62. Column aggregation in cellwise templates Sub-task Closed Matthias Boehm
        63. Rework codegen candidate exploration algorithm Sub-task Closed Matthias Boehm
        64. Automatically determine max memory/compute bandwidth Sub-task Open Unassigned
        65. Improve sparse-safe declaration of fused operators Sub-task Open Unassigned
        66. Extend outer template support for matrix-scalar ops on sparse driver Sub-task Closed Matthias Boehm
        67. Allow external codegen compiler configuration Sub-task Open Unassigned
        68. More aggressive dynamic recompilation w/ codegen Sub-task Open Unassigned
        69. Extended plan enumeration statistics Sub-task Open Unassigned
         


              People

              • Assignee:
                mboehm7 Matthias Boehm
                Reporter:
                mboehm7 Matthias Boehm