Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-12566

Incorrect result returns when using COALESCE in WHERE condition with LEFT JOIN

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.13.0
    • Fix Version/s: 2.0.0
    • Component/s: Query Planning
    • Labels:
      None

      Description

      The left join query with on/where clause returns incorrect result (more rows are returned). See the reproducible sample below.
      Left table with data:

      CREATE TABLE ltable (i int, la int, lk1 string, lk2 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
      ---
      1,\N,CD5415192314304,00071
      2,\N,CD5415192225530,00071
      

      Right table with data:

      CREATE TABLE rtable (ra int, rk1 string, rk2 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
      ---
      1,CD5415192314304,00071
      45,CD5415192314304,00072
      

      Query:

      SELECT * FROM ltable l LEFT OUTER JOIN rtable r on (l.lk1 = r.rk1 AND l.lk2 = r.rk2) WHERE COALESCE(l.la,'EMPTY')=COALESCE(r.ra,'EMPTY');
      

      Result returns:

      1	NULL	CD5415192314304	00071	NULL	NULL	NULL
      2	NULL	CD5415192225530	00071	NULL	NULL	NULL
      

      The correct result should be

      2	NULL	CD5415192225530	00071	NULL	NULL	NULL
      

        Attachments

        1. HIVE-12566.1.patch
          12 kB
          Chaoyu Tang
        2. HIVE-12566.patch
          13 kB
          Chaoyu Tang

          Activity

            People

            • Assignee:
              ctang Chaoyu Tang
              Reporter:
              ctang Chaoyu Tang
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: