Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-11604

HIVE return wrong results in some queries with PTF function

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 1.1.0, 1.2.0, 2.0.0
    • 2.0.0
    • Logical Optimizer
    • None

    Description

      Following query returns empty result which is not right:

      select ddd.id, ddd.fkey, aaa.name
      from (
          select id, fkey, 
          row_number() over (partition by id, fkey) as rnum
          from tlb1 group by id, fkey
       ) ddd 
      inner join tlb2 aaa on aaa.fid = ddd.fkey;
      

      After remove row_number() over (partition by id, fkey) as rnum from query, the right result returns.

      Reproduce:

      create table tlb1 (id int, fkey int, val string);
      create table tlb2 (fid int, name string);
      insert into table tlb1 values(100,1,'abc');
      insert into table tlb1 values(200,1,'efg');
      insert into table tlb2 values(1, 'key1');
      
      select ddd.id, ddd.fkey, aaa.name
      from (
          select id, fkey, 
          row_number() over (partition by id, fkey) as rnum
          from tlb1 group by id, fkey
       ) ddd 
      inner join tlb2 aaa on aaa.fid = ddd.fkey;
      ....
      INFO  : Ended Job = job_local1070163923_0017
      +---------+-----------+-----------+--+
      No rows selected (14.248 seconds)
      | ddd.id  | ddd.fkey  | aaa.name  |
      +---------+-----------+-----------+--+
      +---------+-----------+-----------+--+
      
      0: jdbc:hive2://localhost:10000> select ddd.id, ddd.fkey, aaa.name
      from (
          select id, fkey 
          from tlb1 group by id, fkey
       ) ddd 
      inner join tlb2 aaa on aaa.fid = ddd.fkey;select ddd.id, ddd.fkey, aaa.name
      0: jdbc:hive2://localhost:10000> from (
      0: jdbc:hive2://localhost:10000>     select id, fkey 
      0: jdbc:hive2://localhost:10000>     from tlb1 group by id, fkey
      0: jdbc:hive2://localhost:10000>  ) ddd 
      0: jdbc:hive2://localhost:10000> 
      inner join tlb2 aaa on aaa.fid = ddd.fkey;
      INFO  : Number of reduce tasks not specified. Estimated from input data size: 1
      ...
      INFO  : Ended Job = job_local672340505_0019
      +---------+-----------+-----------+--+
      2 rows selected (14.383 seconds)
      | ddd.id  | ddd.fkey  | aaa.name  |
      +---------+-----------+-----------+--+
      | 100     | 1         | key1      |
      | 200     | 1         | key1      |
      +---------+-----------+-----------+--+
      
      

      Attachments

        1. HIVE-11604.2.patch
          30 kB
          Yongzhi Chen
        2. HIVE-11604.1.patch
          19 kB
          Yongzhi Chen

        Activity

          People

            ychena Yongzhi Chen
            ychena Yongzhi Chen
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: