Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-11604

HIVE return wrong results in some queries with PTF function

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0, 1.1.0, 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: Logical Optimizer
    • Labels:
      None
    • Target Version/s:

      Description

      Following query returns empty result which is not right:

      select ddd.id, ddd.fkey, aaa.name
      from (
          select id, fkey, 
          row_number() over (partition by id, fkey) as rnum
          from tlb1 group by id, fkey
       ) ddd 
      inner join tlb2 aaa on aaa.fid = ddd.fkey;
      

      After remove row_number() over (partition by id, fkey) as rnum from query, the right result returns.

      Reproduce:

      create table tlb1 (id int, fkey int, val string);
      create table tlb2 (fid int, name string);
      insert into table tlb1 values(100,1,'abc');
      insert into table tlb1 values(200,1,'efg');
      insert into table tlb2 values(1, 'key1');
      
      select ddd.id, ddd.fkey, aaa.name
      from (
          select id, fkey, 
          row_number() over (partition by id, fkey) as rnum
          from tlb1 group by id, fkey
       ) ddd 
      inner join tlb2 aaa on aaa.fid = ddd.fkey;
      ....
      INFO  : Ended Job = job_local1070163923_0017
      +---------+-----------+-----------+--+
      No rows selected (14.248 seconds)
      | ddd.id  | ddd.fkey  | aaa.name  |
      +---------+-----------+-----------+--+
      +---------+-----------+-----------+--+
      
      0: jdbc:hive2://localhost:10000> select ddd.id, ddd.fkey, aaa.name
      from (
          select id, fkey 
          from tlb1 group by id, fkey
       ) ddd 
      inner join tlb2 aaa on aaa.fid = ddd.fkey;select ddd.id, ddd.fkey, aaa.name
      0: jdbc:hive2://localhost:10000> from (
      0: jdbc:hive2://localhost:10000>     select id, fkey 
      0: jdbc:hive2://localhost:10000>     from tlb1 group by id, fkey
      0: jdbc:hive2://localhost:10000>  ) ddd 
      0: jdbc:hive2://localhost:10000> 
      inner join tlb2 aaa on aaa.fid = ddd.fkey;
      INFO  : Number of reduce tasks not specified. Estimated from input data size: 1
      ...
      INFO  : Ended Job = job_local672340505_0019
      +---------+-----------+-----------+--+
      2 rows selected (14.383 seconds)
      | ddd.id  | ddd.fkey  | aaa.name  |
      +---------+-----------+-----------+--+
      | 100     | 1         | key1      |
      | 200     | 1         | key1      |
      +---------+-----------+-----------+--+
      
      

        Attachments

        1. HIVE-11604.2.patch
          30 kB
          Yongzhi Chen
        2. HIVE-11604.1.patch
          19 kB
          Yongzhi Chen

          Activity

            People

            • Assignee:
              ychena Yongzhi Chen
              Reporter:
              ychena Yongzhi Chen
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: