Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-5375

Nested loop join: return correct result for left join

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    Description

      Mini repro:
      1. Create 2 Hive tables with data

      CREATE TABLE t1 (
        FYQ varchar(999),
        dts varchar(999),
        dte varchar(999)
      )
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
      
      2016-Q1,2016-06-01,2016-09-30
      2016-Q2,2016-09-01,2016-12-31
      2016-Q3,2017-01-01,2017-03-31
      2016-Q4,2017-04-01,2017-06-30
      
      CREATE TABLE t2 (
        who varchar(999),
        event varchar(999),
        dt varchar(999)
      )
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
      
      aperson,did somthing,2017-01-06
      aperson,did somthing else,2017-01-12
      aperson,had chrsitmas,2016-12-26
      aperson,went wild,2016-01-01
      

      2. Impala Query shows correct result

      select t2.dt, t1.fyq, t2.who, t2.event
      from t2
      left join t1 on t2.dt between t1.dts and t1.dte
      order by t2.dt;
      +------------+---------+---------+-------------------+
      | dt         | fyq     | who     | event             |
      +------------+---------+---------+-------------------+
      | 2016-01-01 | NULL    | aperson | went wild         |
      | 2016-12-26 | 2016-Q2 | aperson | had chrsitmas     |
      | 2017-01-06 | 2016-Q3 | aperson | did somthing      |
      | 2017-01-12 | 2016-Q3 | aperson | did somthing else |
      +------------+---------+---------+-------------------+
      

      3. Drill query shows wrong results:

      alter session set planner.enable_nljoin_for_scalar_only=false;
      use hive;
      select t2.dt, t1.fyq, t2.who, t2.event
      from t2
      left join t1 on t2.dt between t1.dts and t1.dte
      order by t2.dt;
      
      +-------------+----------+----------+--------------------+
      |     dt      |   fyq    |   who    |       event        |
      +-------------+----------+----------+--------------------+
      | 2016-12-26  | 2016-Q2  | aperson  | had chrsitmas      |
      | 2017-01-06  | 2016-Q3  | aperson  | did somthing       |
      | 2017-01-12  | 2016-Q3  | aperson  | did somthing else  |
      +-------------+----------+----------+--------------------+
      3 rows selected (2.523 seconds)
      

      Attachments

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            arina Arina Ielchiieva
            arina Arina Ielchiieva
            Aman Sinha Aman Sinha
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment