CALCITE-3055

TableScan type mismatch in VolcanoPlanner

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.19.0
    • Fix Version/s: 1.20.0
    • None

    Description

      In Drill with Calcite updated to 1.19.0, the following query

      CREATE VIEW nation_view_testunionall_expandable_star AS select n_name, n_nationkey FROM cp.`tpch/nation.parquet`;
      (SELECT n_comment, n_regionkey FROM cp.`tpch/nation.parquet` LIMIT 5) UNION ALL (SELECT * FROM nation_view_testunionall_expandable_star  LIMIT 5)
      

      fails with

      15:46:06.005 [2333c7f1-89f3-0b80-8369-68e24895729e:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: AssertionError: Type mismatch:
      left:
      RecordType(DYNAMIC_STAR **, ANY n_comment, ANY n_regionkey) NOT NULL
      right:
      RecordType(DYNAMIC_STAR **, ANY n_comment, ANY n_regionkey, ANY n_name, ANY n_nationkey) NOT NULL
      

      when the Volcano planner registers rel nodes. The initial plan is:

      LogicalUnion(all=[true]): rowcount = 10.0, cumulative cost = {520.0 rows, 940.755032994728 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 74
        LogicalSort(fetch=[5]): rowcount = 5.0, cumulative cost = {205.0 rows, 365.377516497364 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 68
          LogicalProject(n_comment=[$1], n_regionkey=[$2]): rowcount = 100.0, cumulative cost = {200.0 rows, 301.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 67
            EnumerableTableScan(table=[[cp, tpch/nation.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 29
        LogicalSort(fetch=[5]): rowcount = 5.0, cumulative cost = {305.0 rows, 565.377516497364 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 73
          LogicalProject(n_name=[$0], n_nationkey=[$1]): rowcount = 100.0, cumulative cost = {300.0 rows, 501.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 72
            LogicalProject(n_name=[$3], n_nationkey=[$4]): rowcount = 100.0, cumulative cost = {200.0 rows, 301.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 71
              EnumerableTableScan(table=[[cp, tpch/nation.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 34
      

      The plan contains two EnumerableTableScan operators whose row types differ, as shown in the error message above. As part of CALCITE-2454, the rel data type was removed from the key of the VolcanoPlanner.mapDigestToRel map.

      To fix this, `TableScan` should include a rowType item in its AbstractRelNode#explainTerms(RelWriter) implementation, because for dynamic tables different scan operators otherwise produce the same digest. The same was already done for the Values operator and for literals in conditions.
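
      The collision described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Calcite code: the class DigestCollisionSketch, the Scan type, the register method, and both digest formats are hypothetical. Only the table name and the field names are taken from the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of a digest-keyed registry; not Calcite's implementation.
class DigestCollisionSketch {

    /** Stand-in for a table scan: a table name plus the field names of its row type. */
    static final class Scan {
        final String table;
        final List<String> rowType;

        Scan(String table, List<String> rowType) {
            this.table = table;
            this.rowType = rowType;
        }

        /** Digest built from the table name only (assumed format). */
        String digestWithoutRowType() {
            return "TableScan(table=[" + table + "])";
        }

        /** Digest that also carries the row type (assumed format). */
        String digestWithRowType() {
            return "TableScan(table=[" + table + "], rowType=" + rowType + ")";
        }
    }

    /**
     * Registers each scan in a map keyed by digest and returns, per input scan,
     * the instance the map resolves it to (the first scan seen with that digest).
     */
    static List<Scan> register(List<Scan> scans, boolean includeRowType) {
        Map<String, Scan> digestToScan = new HashMap<>();
        List<Scan> resolved = new ArrayList<>();
        for (Scan scan : scans) {
            String key = includeRowType
                    ? scan.digestWithRowType()
                    : scan.digestWithoutRowType();
            resolved.add(digestToScan.computeIfAbsent(key, k -> scan));
        }
        return resolved;
    }

    public static void main(String[] args) {
        Scan left = new Scan("cp, tpch/nation.parquet",
                Arrays.asList("**", "n_comment", "n_regionkey"));
        Scan right = new Scan("cp, tpch/nation.parquet",
                Arrays.asList("**", "n_comment", "n_regionkey", "n_name", "n_nationkey"));
        List<Scan> scans = Arrays.asList(left, right);

        // Without the row type in the digest, the second scan resolves to the first one,
        // whose row type differs: the situation a type-equality check would reject.
        Scan merged = register(scans, false).get(1);
        System.out.println("same row type without rowType term: "
                + merged.rowType.equals(right.rowType));

        // With the row type in the digest, each scan keeps its own entry.
        Scan distinct = register(scans, true).get(1);
        System.out.println("same row type with rowType term: "
                + distinct.rowType.equals(right.rowType));
    }
}
```

      In this model, adding the row type to the digest is enough to keep the two scans apart, which mirrors the fix proposed in the description.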

            People

              Assignee: Danny Chen (danny0405)
              Reporter: Bohdan Kazydub (bohdan)
              Votes: 0
              Watchers: 7

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h