Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-5367

Join query returns wrong results

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 1.10.0
    • None
    • Execution - Flow
    • None
    • 3 node cluster

    Description

      Join query returns wrong results

      Drill 1.10.0 does not return any results.

      0: jdbc:drill:schema=dfs.tmp> SELECT * FROM using_f1 JOIN (SELECT * FROM using_f2) foo USING(col_prime);
      +---------+------------+------------+-------------+---------+-----------+----------+-------------+-------------+--------------+----------+------------+
      | col_dt  | col_state  | col_prime  | col_varstr  | col_id  | col_name  | col_dt0  | col_state0  | col_prime0  | col_varstr0  | col_id0  | col_name0  |
      +---------+------------+------------+-------------+---------+-----------+----------+-------------+-------------+--------------+----------+------------+
      +---------+------------+------------+-------------+---------+-----------+----------+-------------+-------------+--------------+----------+------------+
      No rows selected (0.314 seconds)
      
      Explain plan for above failing query
      
      0: jdbc:drill:schema=dfs.tmp> explain plan for SELECT * FROM using_f1 JOIN (SELECT * FROM using_f2) foo USING(col_prime);
      +------+------+
      | text | json |
      +------+------+
      | 00-00    Screen
      00-01      ProjectAllowDup(*=[$0], *0=[$1])
      00-02        Project(T49¦¦*=[$0], T48¦¦*=[$2])
      00-03          Project(T49¦¦*=[$1], col_prime=[$2], T48¦¦*=[$0])
      00-04            HashJoin(condition=[=($2, $0)], joinType=[inner])
      00-06              Project(T48¦¦*=[$0])
      00-08                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/using_f2]], selectionRoot=maprfs:/tmp/using_f2, numFiles=1, usedMetadataFile=false, columns=[`*`]]])
      00-05              Project(T49¦¦*=[$0], col_prime=[$1])
      00-07                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/using_f1]], selectionRoot=maprfs:/tmp/using_f1, numFiles=1, usedMetadataFile=false, columns=[`*`]]])
      

      Whereas Postgres 9.3 returns expected results for the same data.

      postgres=# SELECT * FROM using_f1 JOIN (SELECT * FROM using_f2) foo USING(col_prime);
       col_prime |   col_dt   | col_state |                          col_varstr
                      | col_id |     col_name      |   col_dt   | col_state |
                    col_varstr                            | col_id |       col_name
      -----------+------------+-----------+-----------------------------------------------
      ----------------+--------+-------------------+------------+-----------+-------------
      ----------------------------------------------------+--------+----------------------
             103 | 2014-12-24 | TX        | LUW2QzWGdJfnxHrqm3vwyndzRBFwH8l5xVDaM3hTiZAanp
      j               |  19462 | Julie Lennox      | 1990-01-11 | WV        | KKzEOgle6E5h
      NANduNAAIp9DQnGLGxO                                 |  54217 | Derek Wilson
             103 | 1985-07-18 | CA        | aYQ2uLpPxebGGRvcX0fahrAOO4yhkDRvMPES6PuYsIfwkU
      Mrcq6NSdt0j     |  48987 | Lillian Lupo      | 1990-01-11 | WV        | KKzEOgle6E5h
      NANduNAAIp9DQnGLGxO                                 |  54217 | Derek Wilson
             103 | 1988-02-27 | SC        | OcVKheHMyeKLgcvamrJHUxKyCGGJGci3Y9ht2LI9T5Ek1n
      wckB            |  52840 | Martha Rose       | 1990-01-11 | WV        | KKzEOgle6E5h
      NANduNAAIp9DQnGLGxO                                 |  54217 | Derek Wilson
             211 | 1989-12-06 | SD        | HHlmvV4
                      |   1131 | Kenneth Hayes     | 1989-05-31 | MT        | yhHfCGaCqnAr
      XUCD4jRoZQ4fj6IQIKZHUGLlIsSr1L7voCE3lEmj3DOSFqJ0Kq  |  49191 | Joan Stein
              43 | 2006-01-24 | NV        | EJAN2JjRqoQWgp7rHLT1yPMBR50g1Kil3klu1vPritFKB2
      5EjmL1tLXleagAP |  30179 | William Strassel  | 2006-03-02 | MI        | W9G0nWo8QNtH
      r9YxOscigPbtXEtNPZ                                  |  44849 | Catherine Turner
             193 | 1990-01-14 | NV        | 9nd3po1bnyasqINVA
                      |  47775 | James Walters
      ...
      1990-01-14 | NV        | 9nd3po1bnyasqINVA
                      |  47775 | James Walters     | 1980-04-22 | ID        | jR8jr1lqDprU
      FPhAX4xZnulndYNd3                                   |   5876 | Rosie Johnson
               5 | 2004-01-27 | KS        | 0A8Gwqm66k6wQ1KzcUdSQKZU3AZtPImxb8
                      |  57787 | Dean Salazar      | 1997-09-13 | SC        | uq35Sqf1GfPt
      IV1mE2CzwxKaX                                       |  17041 | Dorothy Paulsen
               5 | 1999-07-12 | UT        | hQk9DBx1egLNIpi9btvN7GPewgvPROWaNArsxAbRILW3dN
      fwi526          |  38130 | Beverly Flores    | 1997-09-13 | SC        | uq35Sqf1GfPt
      IV1mE2CzwxKaX                                       |  17041 | Dorothy Paulsen
      (239 rows)                
      

      Attachments

        1. using_f2.parquet
          21 kB
          Khurram Faraaz
        2. using_f1.parquet
          3 kB
          Khurram Faraaz

        Activity

          People

            Unassigned Unassigned
            khfaraaz Khurram Faraaz
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated: