As suggested in https://issues.apache.org/jira/browse/STORM-1040?focusedCommentId=15036651&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15036651, compiling the logical plan from Calcite down to Storm physical plan will clarify the implementation of StormSQL.
For me having own relational algebras (rel) has several advantages,
- We can push operator handling logic to rel itself. Before that we should traverse Calcite logical rel tree with PostOrderRelNodeVisitor, and visitor needs to handle Calcite's rel directly. Now the logic how to configure Trident topology is all handled from separate rels.
- We sometimes want to have more derived rels compared to Calcite logical operators. One of example is Join. There's only one logical rel regarding join in Calcite - LogicalJoin - but we're now converting LogicalJoin to EquiJoin if conditions are met. If we have various types of join it will make the difference. We're not prepared yet, but streaming scan vs table scan, and streaming insert vs table insert are the other cases.
- We can even override the methods how to represent the rel in explain string if we think Calcite's explain is less informational. For example, showing initial parallelism (when we support) for Scan.
- We can apply query optimizations: Defining derived rels helps further query optimizations, like filter pushdown. Calcite rels is not aware of data source characteristic, and we can include it to our own rels.