Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-17503

CBO: Add "Explain CBO" to print Calcite trees

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 3.0.0
    • 4.0.0
    • CBO

    Description

      The calcite tree is only logged at debug level in Hive right now, which is inconvenient to debug the CBO issues with selectivity and join rotations.

      The Calcite plans, before being sent to the rest of the optimizers end up looking like

      HiveProject(s_store_name=[$0], s_company_id=[$1], s_street_number=[$2], s_street_name=[$3], s_street_type=[$4], s_suite_number=[$5], s_city=[$6], s_county=[$7], s_state=[$8], s_zip=[$9], 30days=[$10], 3160days=[$11], 6190days=[$12], 91120days=[$13], 120days=[$14])
          HiveAggregate(group=[{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}], agg#0=[sum($10)], agg#1=[sum($11)], agg#2=[sum($12)], agg#3=[sum($13)], agg#4=[sum($14)])
            HiveProject($f0=[$14], $f1=[$15], $f2=[$16], $f3=[$17], $f4=[$18], $f5=[$19], $f6=[$20], $f7=[$21], $f8=[$22], $f9=[$23], $f10=[CASE(<=(-($8, $4), CAST(30):BIGINT), 1, 0)], $f11=[CASE(AND(>(-($8, $4), CAST(30):BIGINT), <=(-($8, $4), CAST(60):BIGINT)), 1, 0)], $f12=[CASE(AND(>(-($8, $4), CAST(60):BIGINT), <=(-($8, $4), CAST(90):BIGINT)), 1, 0)], $f13=[CASE(AND(>(-($8, $4), CAST(90):BIGINT), <=(-($8, $4), CAST(120):BIGINT)), 1, 0)], $f14=[CASE(>(-($8, $4), CAST(120):BIGINT), 1, 0)])
              HiveJoin(condition=[=($2, $13)], joinType=[inner], algorithm=[none], cost=[not available])
                HiveJoin(condition=[=($4, $12)], joinType=[inner], algorithm=[none], cost=[not available])
                  HiveJoin(condition=[AND(=($0, $5), =($1, $6), =($3, $7))], joinType=[inner], algorithm=[none], cost=[not available])
                    HiveProject(ss_item_sk=[$1], ss_customer_sk=[$2], ss_store_sk=[$6], ss_ticket_number=[$8], ss_sold_date_sk=[$22])
                      HiveFilter(condition=[AND(IS NOT NULL($1), IS NOT NULL($2), IS NOT NULL($8), IS NOT NULL($6), IS NOT NULL($22))])
                        HiveTableScan(table=[[tpcds_bin_partitioned_orc_10000.store_sales]], table:alias=[store_sales])
                    HiveJoin(condition=[=($3, $4)], joinType=[inner], algorithm=[none], cost=[not available])
                      HiveProject(sr_item_sk=[$1], sr_customer_sk=[$2], sr_ticket_number=[$8], sr_returned_date_sk=[$19])
                        HiveFilter(condition=[AND(IS NOT NULL($1), IS NOT NULL($2), IS NOT NULL($8), IS NOT NULL($19))])
                          HiveTableScan(table=[[tpcds_bin_partitioned_orc_10000.store_returns]], table:alias=[store_returns])
                      HiveProject(d_date_sk=[$0], d_year=[CAST(2000):INTEGER], d_moy=[CAST(9):INTEGER])
                        HiveFilter(condition=[AND(=($6, 2000), =($8, 9), IS NOT NULL($0))])
                          HiveTableScan(table=[[tpcds_bin_partitioned_orc_10000.date_dim]], table:alias=[d2])
                  HiveProject(d_date_sk=[$0])
                    HiveFilter(condition=[IS NOT NULL($0)])
                      HiveTableScan(table=[[tpcds_bin_partitioned_orc_10000.date_dim]], table:alias=[d1])
                HiveProject(s_store_sk=[$0], s_store_name=[$5], s_company_id=[$16], s_street_number=[$18], s_street_name=[$19], s_street_type=[$20], s_suite_number=[$21], s_city=[$22], s_county=[$23], s_state=[$24], s_zip=[$25])
                  HiveFilter(condition=[IS NOT NULL($0)])
                    HiveTableScan(table=[[tpcds_bin_partitioned_orc_10000.store]], table:alias=[store])
      

      Attachments

        Issue Links

          Activity

            People

              jcamachorodriguez Jesus Camacho Rodriguez
              gopalv Gopal Vijayaraghavan
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: