Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-1635

Allow selecting non-qualified col ref from equijoin using clause

    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • Impala 2.1
    • None
    • Frontend

    Description

      In order to support a dplyr source for Impala (see https://github.com/piersharding/dplyrimpaladb and https://github.com/hadley/dplyr/issues/383#issuecomment-46007851), we should add support for selecting a non-qualified column ref when there's a equijoin with a USING clause on that column.

      For example, the following query fails today:

      select `hofid`, `playerid` FROM
           (SELECT `playerid`, `hofid`, FROM `Master`) as t1
           INNER JOIN
           (SELECT `hofid` , `votedby` FROM `HallOfFame` WHERE `inducted` = 'Y')  as t2
      USING (`hofid`)
      LIMIT 10;
      
      ERROR: AnalysisException: Unqualified column reference 'hofid' is ambiguous
      

      At least PostgreSQL do support this. We'd have to pick one of the two columns to return. If there's a cast involved, we need to decide how to pick one. E.g. if hofid is an INT on one table and a BIGINT on the other. In a simple test it looks to me like PostgreSQL picked the larger type. We should also find out of this is actually part of the SQL spec or if PostgreSQL (and other DBs, I haven't checked) just happen to support this.

      Attachments

        Activity

          People

            Unassigned Unassigned
            hammer Jeff Hammerbacher
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated: