Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-3837

Three Table BucketMapJoin is failing

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • None
    • None

    Description

      The following three table bucketmapjoin query returns 0 as result:

      set hive.optimize.bucketmapjoin = true;
      set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;

      CREATE TABLE t1 (key1 int, value1 string) partitioned by (ds1 string) CLUSTERED BY (key1) INTO 2 BUCKETS STORED AS TEXTFILE;
      load data local inpath '../data/files/srcbucket20.txt' INTO TABLE t1 partition(ds1='part1');
      load data local inpath '../data/files/srcbucket21.txt' INTO TABLE t1 partition(ds1='part1');
      load data local inpath '../data/files/srcbucket20.txt' INTO TABLE t1 partition(ds1='part2');
      load data local inpath '../data/files/srcbucket21.txt' INTO TABLE t1 partition(ds1='part2');

      CREATE TABLE t2 (key2 int, value2 string) partitioned by (ds2 string) CLUSTERED BY (key2) INTO 2 BUCKETS STORED AS TEXTFILE;
      load data local inpath '../data/files/srcbucket20.txt' INTO TABLE t2 partition(ds2='part1');
      load data local inpath '../data/files/srcbucket21.txt' INTO TABLE t2 partition(ds2='part1');
      load data local inpath '../data/files/srcbucket20.txt' INTO TABLE t2 partition(ds2='part2');
      load data local inpath '../data/files/srcbucket21.txt' INTO TABLE t2 partition(ds2='part2');

      CREATE TABLE t3 (key3 int, value3 string) partitioned by (ds3 string) CLUSTERED BY (key3) INTO 2 BUCKETS STORED AS TEXTFILE;
      load data local inpath '../data/files/srcbucket20.txt' INTO TABLE t3 partition(ds3='part1');
      load data local inpath '../data/files/srcbucket21.txt' INTO TABLE t3 partition(ds3='part1');

      – Three Tables Join
      explain extended
      select /+mapjoin(b,c)/ count
      from t1 a join t2 b on (a.key1=b.key2 and a.ds1=b.ds2) join t3 c on (a.key1=c.key3 and a.ds1=c.ds3);

      select /+mapjoin(b,c)/ count
      from t1 a join t2 b on (a.key1=b.key2 and a.ds1=b.ds2) join t3 c on (a.key1=c.key3 and a.ds1=c.ds3);

      It should return 1114(if we run a join without mapjoin).

      Attachments

        Issue Links

          Activity

            People

              zhenxiao Zhenxiao Luo
              zhenxiao Zhenxiao Luo
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated: