FLINK-11715

Add optimize program to organize optimization phases


Details

    Description

      Currently, Flink organizes the optimization phases as separate methods in Batch(Stream)TableEnvironment#optimize. This is hard to extend, especially because Blink has more than ten optimization stages. The methods are also very similar: they differ only in the match order and rule sets for the Hep optimization phases, and in the target traits and rule sets for the Volcano optimization phases.

      In Blink, each optimization stage is abstracted into a FlinkOptimizeProgram, defined as follows:

      /**
        * Like [[org.apache.calcite.tools.Program]], FlinkOptimizeProgram transforms a relational
        * expression into another relational expression.
        */
      trait FlinkOptimizeProgram[OC <: OptimizeContext] {
        def optimize(input: RelNode, context: OC): RelNode
      }
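
As an illustration of how a stage might implement this trait, here is a minimal sketch. To stay runnable without Calcite on the classpath it replaces `RelNode` with a toy tree; `Scan`, `Filter`, the empty `OptimizeContext`, and `MergeFiltersProgram` are hypothetical names that do not come from this issue.

```scala
// Hypothetical stand-ins so the sketch runs without Calcite on the classpath.
sealed trait RelNode
final case class Scan(table: String) extends RelNode
final case class Filter(condition: String, input: RelNode) extends RelNode

// Assumed minimal context; the real OptimizeContext is not shown in this issue.
trait OptimizeContext

// The trait from the description, unchanged.
trait FlinkOptimizeProgram[OC <: OptimizeContext] {
  def optimize(input: RelNode, context: OC): RelNode
}

// Hypothetical program: merges directly nested filters into a single filter.
class MergeFiltersProgram[OC <: OptimizeContext] extends FlinkOptimizeProgram[OC] {
  override def optimize(input: RelNode, context: OC): RelNode = input match {
    case Filter(outer, Filter(inner, child)) =>
      // Combine the two conditions, then retry in case more filters are nested.
      optimize(Filter(s"($inner) AND ($outer)", child), context)
    case Filter(condition, child) =>
      Filter(condition, optimize(child, context))
    case scan: Scan =>
      scan
  }
}
```

Because every stage has the same `optimize(input, context)` shape, a caller can treat stages uniformly regardless of what each one does internally.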
      

      FlinkOptimizeProgram's subclasses include:
      1. FlinkRuleSetProgram, an abstract program that can add or remove RuleSets and set target traits.
      2. FlinkHepRuleSetProgram, a subclass of FlinkRuleSetProgram that runs with HepPlanner.
      3. FlinkVolcanoProgram, a subclass of FlinkRuleSetProgram that runs with VolcanoPlanner.
      4. FlinkGroupProgram, a program that contains a sequence of sub-programs as a group. The programs in the group are executed in sequence, and the whole group can be executed `iterations` times.
      ......
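
The group behaviour in item 4 can be sketched roughly as follows. This is not the Blink class itself: `GroupProgramSketch` and `NamedProgram` are hypothetical, and `RelNode` is replaced by a toy case class that only records which programs ran, so that the execution order is visible.

```scala
// Hypothetical stand-in for Calcite's RelNode: records which programs touched it.
final case class RelNode(trace: Vector[String])

// Assumed minimal context; the real OptimizeContext is not shown in this issue.
trait OptimizeContext

trait FlinkOptimizeProgram[OC <: OptimizeContext] {
  def optimize(input: RelNode, context: OC): RelNode
}

// Hypothetical leaf program that appends its name to the trace.
class NamedProgram[OC <: OptimizeContext](name: String) extends FlinkOptimizeProgram[OC] {
  override def optimize(input: RelNode, context: OC): RelNode =
    input.copy(trace = input.trace :+ name)
}

// Sketch of the described group semantics: run all sub-programs in order,
// and repeat the whole sequence `iterations` times.
class GroupProgramSketch[OC <: OptimizeContext](
    programs: Seq[FlinkOptimizeProgram[OC]],
    iterations: Int) extends FlinkOptimizeProgram[OC] {
  require(iterations >= 1, "iterations must be at least 1")

  override def optimize(input: RelNode, context: OC): RelNode =
    (1 to iterations).foldLeft(input) { (current, _) =>
      programs.foldLeft(current)((node, program) => program.optimize(node, context))
    }
}
```

Since the group is itself a FlinkOptimizeProgram, groups can be nested or placed next to Hep and Volcano programs without special handling.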

      FlinkChainedPrograms is responsible for organizing all the programs. When FlinkChainedPrograms#optimize is called, each program's optimize method is called in sequence, with the output of one program passed as the input to the next.
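
A minimal sketch of such a chain is shown below, assuming programs are registered under a name and run in registration order. `ChainedProgramsSketch`, its `addLast` method, and `TagProgram` are hypothetical illustrations rather than the actual FlinkChainedPrograms API, and `RelNode` is again a toy stand-in.

```scala
import scala.collection.mutable

// Hypothetical stand-in for Calcite's RelNode: records which programs touched it.
final case class RelNode(trace: Vector[String])

// Assumed minimal context; the real OptimizeContext is not shown in this issue.
trait OptimizeContext

trait FlinkOptimizeProgram[OC <: OptimizeContext] {
  def optimize(input: RelNode, context: OC): RelNode
}

// Hypothetical leaf program that appends a tag to the trace.
class TagProgram[OC <: OptimizeContext](tag: String) extends FlinkOptimizeProgram[OC] {
  override def optimize(input: RelNode, context: OC): RelNode =
    input.copy(trace = input.trace :+ tag)
}

// Sketch of a chain: programs are kept in registration order and applied one after another.
class ChainedProgramsSketch[OC <: OptimizeContext] {
  // LinkedHashMap preserves insertion order, which is the execution order here.
  private val programs = mutable.LinkedHashMap.empty[String, FlinkOptimizeProgram[OC]]

  // Hypothetical registration method; returns this to allow chaining.
  def addLast(name: String, program: FlinkOptimizeProgram[OC]): this.type = {
    require(!programs.contains(name), s"program '$name' is already registered")
    programs.put(name, program)
    this
  }

  // Feed the output of each program into the next one.
  def optimize(root: RelNode, context: OC): RelNode =
    programs.values.foldLeft(root)((node, program) => program.optimize(node, context))
}
```

Naming each stage is one way to make the pipeline extensible: a new stage can be added by registering another program instead of editing a monolithic optimize method.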

            People

              godfreyhe godfrey he


                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m