Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-11410

Join with subquery containing a group by incorrectly returns no results

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Cannot Reproduce
    • 1.1.0
    • None
    • Hive
    • None

    Description

      Start by creating a table t with columns c1 and c2 and populate with 1 row of data. For example create table t from an existing table which contains at least 1 row of data by running:

      create table t as select 'abc' as c1, 0 as c2 from Y limit 1; 
      

      Table t looks like the following:

      c1 c2
      abc 0

      Running the following query then returns zero results.

      SELECT 
        t1.c1
      FROM 
        t t1
      JOIN
      (SELECT 
         t2.c1,
         MAX(t2.c2) AS c2
       FROM 
         t t2 
       GROUP BY 
         t2.c1
      ) t3
      ON t1.c2=t3.c2
      

      However, we expected to see the following:

      c1
      abc

      The problem seems to relate to the fact that in the subquery, we group by column c1, but this is not subsequently used in the join condition.

      Attachments

        1. hive-site.xml
          8 kB
          Nicholas Brenwald

        Activity

          People

            mmccline Matt McCline
            nbrenwald Nicholas Brenwald
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: