Calcite / CALCITE-3055

TableScan type mismatch in VolcanoPlanner


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.19.0
    • Fix Version/s: 1.20.0
    • Component/s: None

      Description

      In Drill with Calcite updated to 1.19.0, we observe the following problem.

      The following query

      CREATE VIEW nation_view_testunionall_expandable_star AS select n_name, n_nationkey FROM cp.`tpch/nation.parquet`;
      (SELECT n_comment, n_regionkey FROM cp.`tpch/nation.parquet` LIMIT 5) UNION ALL (SELECT * FROM nation_view_testunionall_expandable_star  LIMIT 5)
      

      fails with

      15:46:06.005 [2333c7f1-89f3-0b80-8369-68e24895729e:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: AssertionError: Type mismatch:
      left:
      RecordType(DYNAMIC_STAR **, ANY n_comment, ANY n_regionkey) NOT NULL
      right:
      RecordType(DYNAMIC_STAR **, ANY n_comment, ANY n_regionkey, ANY n_name, ANY n_nationkey) NOT NULL
      

      while the Volcano planner is registering rel nodes. The initial plan is as follows:

      LogicalUnion(all=[true]): rowcount = 10.0, cumulative cost = {520.0 rows, 940.755032994728 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 74
        LogicalSort(fetch=[5]): rowcount = 5.0, cumulative cost = {205.0 rows, 365.377516497364 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 68
          LogicalProject(n_comment=[$1], n_regionkey=[$2]): rowcount = 100.0, cumulative cost = {200.0 rows, 301.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 67
            EnumerableTableScan(table=[[cp, tpch/nation.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 29
        LogicalSort(fetch=[5]): rowcount = 5.0, cumulative cost = {305.0 rows, 565.377516497364 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 73
          LogicalProject(n_name=[$0], n_nationkey=[$1]): rowcount = 100.0, cumulative cost = {300.0 rows, 501.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 72
            LogicalProject(n_name=[$3], n_nationkey=[$4]): rowcount = 100.0, cumulative cost = {200.0 rows, 301.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 71
              EnumerableTableScan(table=[[cp, tpch/nation.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 34
      

      The plan contains two EnumerableTableScan operators whose row types differ, as shown in the error message above. As part of CALCITE-2454, the rel data type was removed from the key of the VolcanoPlanner.mapDigestToRel map, so the two scans are no longer distinguished by row type.
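
The collision described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `DigestCollisionSketch`, `Scan`, and `Registry` are hypothetical names invented for illustration and are not Calcite classes. The registry stands in for a digest-keyed map such as mapDigestToRel, and the digest deliberately omits the row type.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: none of these types are Calcite classes. */
public class DigestCollisionSketch {

    /** Hypothetical stand-in for a table scan: a table name plus its row type. */
    public static final class Scan {
        public final String table;
        public final List<String> rowType;

        public Scan(String table, List<String> rowType) {
            this.table = table;
            this.rowType = rowType;
        }

        /** Digest built from the table name alone, so the row type is invisible. */
        public String digest() {
            return "TableScan(table=[" + table + "])";
        }
    }

    /** Hypothetical stand-in for a registry keyed by digest only. */
    public static final class Registry {
        private final Map<String, Scan> byDigest = new HashMap<>();

        /** Returns the already-registered node when the digest matches. */
        public Scan register(Scan scan) {
            Scan existing = byDigest.putIfAbsent(scan.digest(), scan);
            return existing != null ? existing : scan;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        Scan narrow = new Scan("cp.tpch/nation.parquet",
            Arrays.asList("**", "n_comment", "n_regionkey"));
        Scan wide = new Scan("cp.tpch/nation.parquet",
            Arrays.asList("**", "n_comment", "n_regionkey", "n_name", "n_nationkey"));

        registry.register(narrow);
        Scan resolved = registry.register(wide);

        // The second scan is resolved to the first one even though
        // their row types differ.
        System.out.println(resolved.rowType.equals(wide.rowType)); // prints false
    }
}
```

In this model the second scan is silently replaced by the first, which is the kind of substitution that would surface later as a row type mismatch.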

      To fix this, `TableScan` should add a rowType item in its override of AbstractRelNode#explainTerms(RelWriter), because for dynamic tables different operators otherwise produce the same digest. The same approach was taken for the Values operator and for literals in conditions.
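
The effect of the proposed change can be sketched with a toy digest builder. This is an illustration, not the actual patch: `RowTypeDigestSketch` and its `digest` method are hypothetical, and the ordered term map only imitates the idea of items written in explainTerms, not Calcite's RelWriter API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: a hypothetical term list, not Calcite's RelWriter. */
public class RowTypeDigestSketch {

    /** Builds a digest string from ordered (name, value) terms. */
    public static String digest(String table, List<String> rowType, boolean includeRowType) {
        Map<String, Object> terms = new LinkedHashMap<>();
        terms.put("table", table);
        if (includeRowType) {
            // The proposed change: make the row type one of the explained terms.
            terms.put("rowType", rowType);
        }
        return "TableScan" + terms;
    }

    public static void main(String[] args) {
        String table = "cp.tpch/nation.parquet";
        List<String> narrow = Arrays.asList("**", "n_comment", "n_regionkey");
        List<String> wide =
            Arrays.asList("**", "n_comment", "n_regionkey", "n_name", "n_nationkey");

        // Without the rowType term both scans share one digest.
        System.out.println(digest(table, narrow, false).equals(digest(table, wide, false))); // prints true
        // With the rowType term the digests differ.
        System.out.println(digest(table, narrow, true).equals(digest(table, wide, true))); // prints false
    }
}
```

Once the row type is part of the digest, a digest-keyed map no longer treats the two scans as the same node.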

            People

            • Assignee: Danny Chen (danny0405)
            • Reporter: Bohdan Kazydub (bohdan)

                Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 0.5h
