Hive / HIVE-16698

HoS should avoid mapjoin optimization in case of union and using table stats


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: Logical Optimizer, Spark
    • Labels: None

    Description

      When hive.spark.use.ts.stats.for.mapjoin is true, HoS (Hive on Spark) does not check whether the big-table branch has upstream UNION operators. This is wrong and can produce an incorrect plan. To reproduce:

      set hive.auto.convert.join=true;
      set hive.auto.convert.join.noconditionaltask.size=16;
      set hive.spark.use.ts.stats.for.mapjoin=true;
      
      create table a (c1 string, c2 int);
      create table b (c3 string, c4 int);
      create table c (c1 string, c2 int);
      create table d (c3 string, c4 int);
      create table e (c5 string, c6 int);
      insert into table a values
      ("a1", 1), ("a2", 2), ("a3", 3), ("a4", 4), ("a5", 5), ("a6", 6), ("a7", 7);
      insert into table b values
      ("b1", 1), ("b2", 2), ("b3", 3), ("b4", 4);
      insert into table c values
      ("c1", 1), ("c2", 2), ("c3", 3), ("c4", 4), ("c5", 5), ("c6", 6), ("c7", 7);
      insert into table d values
      ("d1", 1), ("d2", 2), ("d3", 3), ("d4", 4);
      insert into table e values
      ("d1", 1), ("d2", 2);
      
      explain
      with t1 as (
      select a.c1 as c1, a.c2 as c2, b.c3 as c3 from a join b on a.c2 = b.c4
      ),
      t2 as (
      select c.c1 as c1, c.c2 as c2, d.c3 as c3 from c join d on c.c2 = d.c4
      ),
      t3 as (
      select * from t1 union all select * from t2
      ),
      t4 as (
      select t3.c1, t3.c3, t5.c5 from t3 join e as t5 on t3.c2 = t5.c6
      )
      select * from t4;
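
      For comparison, the same query can be explained with the table-scan statistics path switched off. This is only a sketch: it assumes that setting hive.spark.use.ts.stats.for.mapjoin to false makes HoS fall back to its regular size estimation, which the report does not state explicitly. If that holds, comparing the two plans should show whether the join between the union branch (t3) and table e is converted to a map join only when the property is true.

      ```sql
      -- Assumption: turning the property off avoids the code path described above.
      set hive.spark.use.ts.stats.for.mapjoin=false;

      -- Same query as in the reproduction; compare this plan with the one produced
      -- when the property is true.
      explain
      with t1 as (
      select a.c1 as c1, a.c2 as c2, b.c3 as c3 from a join b on a.c2 = b.c4
      ),
      t2 as (
      select c.c1 as c1, c.c2 as c2, d.c3 as c3 from c join d on c.c2 = d.c4
      ),
      t3 as (
      select * from t1 union all select * from t2
      ),
      t4 as (
      select t3.c1, t3.c3, t5.c5 from t3 join e as t5 on t3.c2 = t5.c6
      )
      select * from t4;
      ```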
      

      Attachments

        1. HIVE-16698.1.patch (32 kB, Chao Sun)

            People

              Assignee: Chao Sun (csun)
              Reporter: Chao Sun (csun)