Calcite / CALCITE-3055

TableScan type mismatch in VolcanoPlanner

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.19.0
    • Fix Version/s: 1.20.0
    • None

    Description

      In Drill, after updating Calcite to 1.19.0, the following query

      CREATE VIEW nation_view_testunionall_expandable_star AS select n_name, n_nationkey FROM cp.`tpch/nation.parquet`;
      (SELECT n_comment, n_regionkey FROM cp.`tpch/nation.parquet` LIMIT 5) UNION ALL (SELECT * FROM nation_view_testunionall_expandable_star  LIMIT 5)
      

      fails with

      15:46:06.005 [2333c7f1-89f3-0b80-8369-68e24895729e:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: AssertionError: Type mismatch:
      left:
      RecordType(DYNAMIC_STAR **, ANY n_comment, ANY n_regionkey) NOT NULL
      right:
      RecordType(DYNAMIC_STAR **, ANY n_comment, ANY n_regionkey, ANY n_name, ANY n_nationkey) NOT NULL
      

      when the Volcano planner registers rel nodes. The initial plan is the following:

      LogicalUnion(all=[true]): rowcount = 10.0, cumulative cost = {520.0 rows, 940.755032994728 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 74
        LogicalSort(fetch=[5]): rowcount = 5.0, cumulative cost = {205.0 rows, 365.377516497364 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 68
          LogicalProject(n_comment=[$1], n_regionkey=[$2]): rowcount = 100.0, cumulative cost = {200.0 rows, 301.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 67
            EnumerableTableScan(table=[[cp, tpch/nation.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 29
        LogicalSort(fetch=[5]): rowcount = 5.0, cumulative cost = {305.0 rows, 565.377516497364 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 73
          LogicalProject(n_name=[$0], n_nationkey=[$1]): rowcount = 100.0, cumulative cost = {300.0 rows, 501.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 72
            LogicalProject(n_name=[$3], n_nationkey=[$4]): rowcount = 100.0, cumulative cost = {200.0 rows, 301.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 71
              EnumerableTableScan(table=[[cp, tpch/nation.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 34
      

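      It contains two EnumerableTableScan operators over the same table whose row types differ, as shown in the error message above. As part of CALCITE-2454, the rel data type was removed from the key of the VolcanoPlanner.mapDigestToRel map, so two scans with identical digests are now treated as the same node even when their row types differ.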
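The collision described above can be illustrated with a minimal, self-contained sketch. It is a simplified model, not Calcite code: `DigestCollisionSketch`, `Scan`, and `register` are hypothetical names, and the digest string format is an assumption chosen only to resemble the plan output. The two scans reuse the ids and field names from the plan and error message.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of a digest-keyed registry; not Calcite code. */
final class DigestCollisionSketch {

  /** Stand-in for a table scan: an id, a table name, and the field names of its row type. */
  static final class Scan {
    final int id;
    final String table;
    final List<String> rowType;

    Scan(int id, String table, List<String> rowType) {
      this.id = id;
      this.table = table;
      this.rowType = rowType;
    }

    /** The digest mentions only the table, so the row type is not part of it. */
    String digest() {
      return "TableScan(table=[" + table + "])";
    }
  }

  /** Stand-in for a map from digest to registered node. */
  private final Map<String, Scan> digestToScan = new HashMap<>();

  /** Returns the scan already registered under the same digest, else registers this one. */
  Scan register(Scan scan) {
    Scan existing = digestToScan.putIfAbsent(scan.digest(), scan);
    return existing != null ? existing : scan;
  }

  public static void main(String[] args) {
    DigestCollisionSketch registry = new DigestCollisionSketch();
    Scan left = new Scan(29, "cp, tpch/nation.parquet",
        Arrays.asList("**", "n_comment", "n_regionkey"));
    Scan right = new Scan(34, "cp, tpch/nation.parquet",
        Arrays.asList("**", "n_comment", "n_regionkey", "n_name", "n_nationkey"));

    registry.register(left);
    Scan found = registry.register(right);

    // Same digest, so the lookup hands back the first scan, whose row type differs.
    System.out.println("returned id = " + found.id);
    System.out.println("row types equal = " + found.rowType.equals(right.rowType));
  }
}
```

In this model the second registration returns the first scan, which is the situation in which a row-type equality check between the two nodes would fail.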

      To fix this, `TableScan` should include a rowType item in its AbstractRelNode#explainTerms(RelWriter) implementation, because for dynamic tables different operators otherwise produce the same digest. The same was done for the Values operator and for literals in conditions.
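A minimal sketch of the proposed idea follows, written as a self-contained model rather than against the Calcite API. `RowTypeInDigestSketch`, `Scan`, and `register` are hypothetical names, and the local `explainTerms()` only mimics the role of the real method by listing the items that make up the digest. It illustrates the proposal in this description and is not necessarily the change that was committed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of the proposed fix; not Calcite code. */
final class RowTypeInDigestSketch {

  /** Stand-in for a table scan over a dynamic table. */
  static final class Scan {
    final int id;
    final String table;
    final List<String> rowType;

    Scan(int id, String table, List<String> rowType) {
      this.id = id;
      this.table = table;
      this.rowType = rowType;
    }

    /** Lists the items that make up the digest, now including the row type. */
    List<String> explainTerms() {
      List<String> terms = new ArrayList<>();
      terms.add("table=[" + table + "]");
      terms.add("rowType=" + rowType); // the proposed extra item
      return terms;
    }

    String digest() {
      return "TableScan(" + String.join(", ", explainTerms()) + ")";
    }
  }

  /** Stand-in for a map from digest to registered node. */
  private final Map<String, Scan> digestToScan = new HashMap<>();

  /** Returns the scan already registered under the same digest, else registers this one. */
  Scan register(Scan scan) {
    Scan existing = digestToScan.putIfAbsent(scan.digest(), scan);
    return existing != null ? existing : scan;
  }

  public static void main(String[] args) {
    RowTypeInDigestSketch registry = new RowTypeInDigestSketch();
    Scan left = new Scan(29, "cp, tpch/nation.parquet",
        Arrays.asList("**", "n_comment", "n_regionkey"));
    Scan right = new Scan(34, "cp, tpch/nation.parquet",
        Arrays.asList("**", "n_comment", "n_regionkey", "n_name", "n_nationkey"));

    registry.register(left);
    // The digests differ, so the second scan is registered as its own node.
    System.out.println("distinct nodes = " + (registry.register(right) == right));
  }
}
```

With the row type in the digest, scans that agree on both table and row type still share one entry, while scans over the same dynamic table with different row types no longer collide.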

          People

            Assignee: Danny Chen (danny0405)
            Reporter: Bohdan Kazydub (bohdan)

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 0.5h
