  1. Apache Nemo
  2. NEMO-183

DAG-centric translation from Beam pipeline to IR DAG

Details

    Description

      In the current Beam frontend, NemoPipelineVisitor defines a 1:1 mapping between Beam PrimitiveTransforms and Nemo IR vertices. Although this "Transform-centric" translation has worked so far thanks to its simple design, it has limitations worth considering.

      • As we develop frontends other than Beam, the 1:1 correspondence between PrimitiveTransform and IRVertex can easily break, because we'll try to make the Nemo IR neutral to dataflow languages.
      • The same PrimitiveTransform may require different translation behavior depending on which CompositeTransform it belongs to. For example, PipelineVisitor should add extra vertices to implement a mapper-side combiner if it detects a GroupByKey inside a Combine transform.
      • Loops. LoopCompositeTransform is inherently a CompositeTransform, and each LoopCompositeTransform requires its own context to determine the 'depth' of each vertex in a loop.
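
      To make the loop point concrete, here is a minimal sketch of per-loop context tracking. It assumes that 'depth' means the number of enclosing loop composites; the class and method names (LoopDepthSketch, enterLoop, leaveLoop, visitPrimitive) are hypothetical and are not part of Beam or Nemo.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: tracks how many loop composites enclose each vertex. */
public class LoopDepthSketch {
  // One entry per loop composite we are currently inside (innermost on top).
  private final Deque<String> loopStack = new ArrayDeque<>();
  // Recorded depth per vertex name, in visiting order.
  private final Map<String, Integer> depthByVertex = new LinkedHashMap<>();

  void enterLoop(String loopName) {
    loopStack.push(loopName);
  }

  void leaveLoop() {
    loopStack.pop();
  }

  void visitPrimitive(String vertexName) {
    // Assumption: depth is the number of enclosing loops at visit time.
    depthByVertex.put(vertexName, loopStack.size());
  }

  Map<String, Integer> depths() {
    return depthByVertex;
  }

  /** Walks a made-up pipeline with one loop nested inside another. */
  static Map<String, Integer> demo() {
    LoopDepthSketch sketch = new LoopDepthSketch();
    sketch.visitPrimitive("Read");
    sketch.enterLoop("outer");
    sketch.visitPrimitive("Update");
    sketch.enterLoop("inner");
    sketch.visitPrimitive("Refine");
    sketch.leaveLoop();
    sketch.visitPrimitive("Merge");
    sketch.leaveLoop();
    sketch.visitPrimitive("Write");
    return sketch.depths();
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

      A flat, transform-by-transform mapping has no natural place to keep this stack, which is why each loop needs a context of its own.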

      As an alternative, I suggest DAG-centric translation. It splits translation into two phases:

      • PipelineVisitor traverses the given Beam pipeline to construct a DAG of Beam transforms, preserving the hierarchy of CompositeTransforms.
      • PipelineTranslator defines not only mappings between PrimitiveTransform and IRVertex, but also correspondences between CompositeTransform and TranslationContext, based on which PipelineTranslator can tune translation behavior.
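
      The two phases can be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed implementation: Node stands in for a Beam transform node, the hand-built tree stands in for what PipelineVisitor would construct in phase 1, the enclosing composite's name stands in for a TranslationContext, and the emitted vertex names (such as PartialCombine) are made up. Data edges between vertices are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of context-aware, two-phase translation. */
public class DagCentricSketch {
  /** Stand-in for a Beam transform node; not a Beam or Nemo class. */
  static final class Node {
    final String name;
    final boolean composite;
    final List<Node> children = new ArrayList<>();

    Node(String name, boolean composite) {
      this.name = name;
      this.composite = composite;
    }

    Node add(Node child) {
      children.add(child);
      return this;
    }
  }

  /** Phase 2: translate a node, consulting the enclosing composite as context. */
  static void translate(Node node, String enclosingComposite, List<String> irVertices) {
    if (node.composite) {
      // A composite contributes no vertex itself; it becomes the context of its children.
      for (Node child : node.children) {
        translate(child, node.name, irVertices);
      }
      return;
    }
    if (node.name.equals("GroupByKey") && "Combine".equals(enclosingComposite)) {
      // Assumption: a mapper-side combiner is one extra vertex placed before the shuffle.
      irVertices.add("PartialCombine");
      irVertices.add("GroupByKey");
    } else {
      // Default case: the primitive maps 1:1 to a vertex.
      irVertices.add(node.name);
    }
  }

  static List<String> translate(Node root) {
    List<String> irVertices = new ArrayList<>();
    translate(root, null, irVertices);
    return irVertices;
  }

  /** Phase 1 stand-in: a hand-built hierarchy with GroupByKey in two different contexts. */
  static List<String> demo() {
    Node pipeline = new Node("Pipeline", true)
        .add(new Node("Read", false))
        .add(new Node("Combine", true)
            .add(new Node("GroupByKey", false))
            .add(new Node("CombineValues", false)))
        .add(new Node("GroupByKey", false))
        .add(new Node("Write", false));
    return translate(pipeline);
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

      In this sketch the GroupByKey inside Combine yields an extra vertex, while the top-level GroupByKey is translated 1:1, which is the kind of context-dependent behavior a purely transform-centric mapping cannot express.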

            People

              jangho JangHo Seo