CALCITE-4894

MV rewriting fails for conjunctive top expressions in SELECT clause

Details

    Description

      MV rewriting fails when at least one expression in the project of either the view or the query references, directly or indirectly, more than one field.

      Consider a view with an expression of the form "f between 1 and 3" (which under the hood becomes "f >= 1 and f <= 3", effectively referencing the same field twice):

      @Test void testViewProjectWithBetween() {
        sql("select s.\"time_id\", s.\"time_id\" between 1 and 3"
                + " from \"foodmart\".\"sales_fact_1997\" as s"
                + " where s.\"store_id\" = 1",
            "select s.\"time_id\""
                + " from \"foodmart\".\"sales_fact_1997\" as s"
                + " where s.\"store_id\" = 1")
            .withDefaultSchemaSpec(CalciteAssert.SchemaSpec.JDBC_FOODMART)
            .ok();
      }

      It fails as follows:

      FAILURE   6.9sec, org.apache.calcite.test.MaterializedViewRelOptRulesTest > testViewProjectWithBetween()
          java.lang.AssertionError
              at org.apache.calcite.rel.rules.materialize.MaterializedViewRule.generateSwapTableColumnReferencesLineage(MaterializedViewRule.java:1046)
              at org.apache.calcite.rel.rules.materialize.MaterializedViewRule.rewriteExpressions(MaterializedViewRule.java:1005)
              at org.apache.calcite.rel.rules.materialize.MaterializedViewJoinRule.rewriteView(MaterializedViewJoinRule.java:278)
              at org.apache.calcite.rel.rules.materialize.MaterializedViewRule.perform(MaterializedViewRule.java:475)
              at org.apache.calcite.rel.rules.materialize.MaterializedViewOnlyFilterRule.onMatch(MaterializedViewOnlyFilterRule.java:50)
              at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:239)
              at org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:61)
              at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:523)
              at org.apache.calcite.tools.Programs.lambda$standard$3(Programs.java:276)
              at org.apache.calcite.tools.Programs$SequenceProgram.run(Programs.java:336)
              at org.apache.calcite.test.MaterializedViewRelOptRulesTest.optimize(MaterializedViewRelOptRulesTest.java:1210)
              at org.apache.calcite.test.AbstractMaterializedViewTest.checkMaterialize(AbstractMaterializedViewTest.java:109)
              at org.apache.calcite.test.AbstractMaterializedViewTest.access$000(AbstractMaterializedViewTest.java:69)
              at org.apache.calcite.test.AbstractMaterializedViewTest$Sql.ok(AbstractMaterializedViewTest.java:230)
              at org.apache.calcite.test.MaterializedViewRelOptRulesTest.testViewProjectWithBetween(MaterializedViewRelOptRulesTest.java:60)

       

      Similarly, when the same kind of expression is present in the query:

      @Test void testQueryProjectWithBetween() {
        sql("select *"
                + " from \"foodmart\".\"sales_fact_1997\" as s"
                + " where s.\"store_id\" = 1",
            "select s.\"time_id\" between 1 and 3"
                + " from \"foodmart\".\"sales_fact_1997\" as s"
                + " where s.\"store_id\" = 1")
            .withDefaultSchemaSpec(CalciteAssert.SchemaSpec.JDBC_FOODMART)
            .ok();
      } 

      Calcite fails as follows:

      FAILURE   5.5sec, org.apache.calcite.test.MaterializedViewRelOptRulesTest > testQueryProjectWithBetween()
          java.lang.AssertionError
              at org.apache.calcite.rel.rules.materialize.MaterializedViewJoinRule.rewriteView(MaterializedViewJoinRule.java:268)
              at org.apache.calcite.rel.rules.materialize.MaterializedViewRule.perform(MaterializedViewRule.java:475)
              at org.apache.calcite.rel.rules.materialize.MaterializedViewProjectFilterRule.onMatch(MaterializedViewProjectFilterRule.java:53)
              at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:239)
              at org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:61)
              at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:523)
              at org.apache.calcite.tools.Programs.lambda$standard$3(Programs.java:276)
              at org.apache.calcite.tools.Programs$SequenceProgram.run(Programs.java:336)
              at org.apache.calcite.test.MaterializedViewRelOptRulesTest.optimize(MaterializedViewRelOptRulesTest.java:1210)
              at org.apache.calcite.test.AbstractMaterializedViewTest.checkMaterialize(AbstractMaterializedViewTest.java:109)
              at org.apache.calcite.test.AbstractMaterializedViewTest.access$000(AbstractMaterializedViewTest.java:69)
              at org.apache.calcite.test.AbstractMaterializedViewTest$Sql.ok(AbstractMaterializedViewTest.java:230)
              at org.apache.calcite.test.MaterializedViewRelOptRulesTest.testQueryProjectWithBetween(MaterializedViewRelOptRulesTest.java:49)

      Both MaterializedViewJoinRule (failing for queries) and MaterializedViewRule (failing for views) share the same logic for rewriting expressions between the view and the query:

      "RelMetadataQuery::getExpressionLineage" is used to get the original lineage of view/query expressions.

      Code comments state that a single lineage is expected because we don't support union:

      // We only support project - filter - join, thus it should map to
      // a single expression

      However, in the presence of complex expressions referencing more than one input field, this does not hold, and the assertion fails.

      The code should be extended to support such expressions.
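
      To make the "same field referenced twice" observation concrete, here is a minimal, self-contained sketch of the BETWEEN expansion described above. It uses a toy expression tree, not Calcite's RexNode API; the class and method names (BetweenExpansion, expandBetween, fieldRefs) are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy expression tree (NOT Calcite's RexNode); all names are hypothetical. */
final class BetweenExpansion {

  /** Marker type for the toy expression nodes. */
  interface Expr {}

  /** Reference to an input field by name. */
  static final class FieldRef implements Expr {
    final String name;
    FieldRef(String name) { this.name = name; }
  }

  /** Integer literal. */
  static final class Literal implements Expr {
    final int value;
    Literal(int value) { this.value = value; }
  }

  /** Operator call such as ">=", "<=" or "AND". */
  static final class Call implements Expr {
    final String op;
    final List<Expr> operands;
    Call(String op, Expr... operands) {
      this.op = op;
      this.operands = Arrays.asList(operands);
    }
  }

  /** Expands "field BETWEEN low AND high" into "field >= low AND field <= high". */
  static Expr expandBetween(FieldRef field, int low, int high) {
    Expr lower = new Call(">=", field, new Literal(low));
    Expr upper = new Call("<=", field, new Literal(high));
    return new Call("AND", lower, upper);
  }

  /** Collects every field reference in the tree, keeping duplicates. */
  static List<String> fieldRefs(Expr e) {
    List<String> refs = new ArrayList<>();
    collect(e, refs);
    return refs;
  }

  private static void collect(Expr e, List<String> refs) {
    if (e instanceof FieldRef) {
      refs.add(((FieldRef) e).name);
    } else if (e instanceof Call) {
      for (Expr operand : ((Call) e).operands) {
        collect(operand, refs);
      }
    }
    // Literals contribute no field references.
  }

  public static void main(String[] args) {
    Expr expanded = expandBetween(new FieldRef("time_id"), 1, 3);
    System.out.println(fieldRefs(expanded)); // prints [time_id, time_id]
  }
}
```

      Although the SQL text mentions "time_id" once, the expanded expression carries two references to it, which is the shape of project expression that triggers the failures reported here.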

            People

              asolimando Alessandro Solimando

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 5h 20m