Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-20435

Refactor ExecNode and ExecEdge

    XMLWordPrintableJSON

Details

    Description

      Currently, there are many improvements about ExecNode:
      1. simplify type parameter of ExecNode. Currently, ExecNode#translateToPlan takes BatchPlanner or StreamPlanner as a parameter, so ExecNode has a type parameter E <: Planner, which indicates the node is a batch node or a streaming node. While in the future, a plan may contain both batch nodes and stream node. The type parameter can be removed, and we will use PlannerBase instead.
      2. port the implementation of ExecNodes to Java
      3. separate the implementation of FlinkPhysicalRel and ExecNode. Currently, an execution node extends both from FlinkPhysicalRel and ExecNode. After a physical node is converted to an exec node, many parameters are unnecessary, such as: RelOptCluster, RelTraitSet, etc. With more optimizations on ExecNode, We need ExecNode to be cleaner and simpler. So we will separate the implementation of FlinkPhysicalRel and ExecNode.

      Currently, the ExecEdge represents the properties for the input of an operator, the properties include
      1. required data distribute for an input (the input corresponds to the Input)
      2. DamBehavior which describes the behaviors how an in record may trigger the output of the target operator.
      3. the priority of this input read by the target operator

      So ExecEdge should be rename to InputProperty, and we will re-introduce ExecEdge which describes how to connect two ExecNodes, and how to shuffle the data between two ExecNodes. Its role should be similar to StreamEdge.

      This is an umbrella issue, we will create more related sub-tasks.

      Attachments

        1.
        Simplify type parameter of ExecNode Sub-task Closed godfrey he
        2.
        Port ExecNode to Java Sub-task Closed godfrey he
        3.
        Adjust the explain result Sub-task Closed godfrey he
        4.
        Don't share inputs between FlinkPhysicalRel and ExecNode Sub-task Closed godfrey he
        5.
        Refactor verifyPlan methods in TableTestBase Sub-task Closed godfrey he
        6.
        Deprecate the classes in scala `nodes.exec` package Sub-task Closed godfrey he
        7.
        Introduce getDescription, getOutputType, replaceInput methods for ExecNode Sub-task Closed godfrey he
        8.
        Separate the implementation of BatchExecExchange and StreamExecExchange Sub-task Closed godfrey he
        9.
        Separate the implementation of BatchExecMultipleInput and StreamExecMultipleInput Sub-task Closed godfrey he
        10.
        Separate the implementation of BatchExecTableSourceScan and StreamExecTableSourceScan Sub-task Closed godfrey he
        11.
        Separate the implementation of BatchExecLegacyTableSourceScan and StreamExecLegacyTableSourceScan Sub-task Closed godfrey he
        12.
        Separate the implementation of BatchExecDataStreamScan and StreamExecDataStreamScan Sub-task Closed godfrey he
        13.
        Separate the implementation of BatchExecCalc and StreamExecCalc Sub-task Closed godfrey he
        14.
        Port BatchExecPythonCalc and StreamExecPythonCalc to Java Sub-task Closed Huang Xingbo
        15.
        Separate the implementation of StreamExecChangelogNormalize Sub-task Closed godfrey he
        16.
        Separate the implementation of StreamExecWatermarkAssigner Sub-task Closed godfrey he
        17.
        Introduce translateToExecNode method for FlinkPhysicalRel Sub-task Closed godfrey he
        18.
        ExecNode#getOutputType method should return LogicalType instead of RowType Sub-task Closed godfrey he
        19.
        Separate the implementation of BatchExecCorrelate and StreamExecCorrelate Sub-task Closed godfrey he
        20.
        Port BatchExecPythonCorrelate and StreamExecPythonCorrelate to Java Sub-task Closed Huang Xingbo
        21.
        Separate the implementation of BatchExecValues and StreamExecValues Sub-task Closed godfrey he
        22.
        Separate the implementation of BatchExecUnion and StreamExecUnion Sub-task Closed godfrey he
        23.
        Separate the implementation of BatchExecExpand and StreamExecExpand Sub-task Closed godfrey he
        24.
        Separate the implementation of StreamExecDropUpdateBefore Sub-task Closed godfrey he
        25.
        Separate the implementation of StreamExecMiniBatchAssigner Sub-task Closed godfrey he
        26.
        Change BatchExecNode & StreamExecNode to interface and make each node extended from ExecNodeBase directly Sub-task Closed godfrey he
        27.
        Separate the implementation of BatchExecLimit and StreamExecLimit Sub-task Closed Wenlong Lyu
        28.
        Separate the implementation of stream group aggregate nodes Sub-task Closed godfrey he
        29.
        Separate the implementation of batch group aggregate nodes Sub-task Closed godfrey he
        30.
        Port stream python group aggregate nodes to Java Sub-task Closed Huang Xingbo
        31.
        Port batch python group aggregate nodes to Java Sub-task Closed Huang Xingbo
        32.
        Separate the implementation of sort nodes Sub-task Closed Wenlong Lyu
        33.
        Separate the implementation of BatchExecRank Sub-task Closed Wenlong Lyu
        34.
        Separate the implementation of BatchExec nodes for Join Sub-task Closed Wenlong Lyu
        35.
        Separate the implementation of stream window aggregate nodes Sub-task Closed godfrey he
        36.
        Separate the implementation of batch window aggregate nodes Sub-task Closed godfrey he
        37.
        Port StreamExecPythonGroupWindowAggregate and BatchExecPythonGroupWindowAggregate to Java Sub-task Closed Huang Xingbo
        38.
        Separate the implementation of StreamExecTemporalJoin Sub-task Closed Wenlong Lyu
        39.
        Separate the implementation of BatchExecOverAggregate and StreamExecOverAggregate Sub-task Closed godfrey he
        40.
        Port StreamExecPythonOverAggregate and BatchExecPythonOverAggregate to Java Sub-task Closed Huang Xingbo
        41.
        Separate the implementation of StreamExecLookup and BatchExecLookup Sub-task Closed Wenlong Lyu
        42.
        Separate the implementation of StreamExecMatch Sub-task Closed godfrey he
        43.
        Separate the implementation of StreamExecDeduplicate Sub-task Closed godfrey he
        44.
        Separate the implementation of sink nodes Sub-task Closed godfrey he
        45.
        Rename Stream(/Batch)ExecIntermediateTableScan to Stream(/Batch)PhysicalIntermediateTableScan Sub-task Closed godfrey he
        46.
        Remove legacy exec nodes Sub-task Closed godfrey he
        47.
        Separete the implementation of StreamExecJoin Sub-task Closed Wenlong Lyu
        48.
        Separate the implementation of StreamExecIntervalJoin Sub-task Closed Wenlong Lyu
        49.
        Introduce ExecNodeGraph to wrap the ExecNode topology Sub-task Closed godfrey he
        50.
        Rename ExecEdge to InputProperty Sub-task Closed godfrey he
        51.
        Introduce ExecEdge to connect two ExecNodes Sub-task Closed godfrey he
        52.
        Remove BatchExecExchange and StreamExecExchange, and replace their functionality with ExecEdge Sub-task Closed Unassigned

        Activity

          People

            godfreyhe godfrey he
            godfreyhe godfrey he
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: