Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-546

CREATE VIEW not compatible with SELECT * + JOIN + USING

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • Impala 1.1
    • Impala 1.2
    • None
    • None

    Description

      If I issue CREATE VIEW that selects all columns from a JOIN...USING query, it fails with an error because the result set includes the identically named column(s) from both tables:

      [localhost:21000] > create table j1 (x int, s string);
      Query: create table j1 (x int, s string)
      [localhost:21000] > create table j2 (x int, a string);
      Query: create table j2 (x int, a string)
      [localhost:21000] > create view join_view as select * from j1 join j2 using ;
      Query: create view join_view as select * from j1 join j2 using
      ERROR: MetaException: javax.jdo.JDODataStoreException: Add request failed : INSERT INTO `COLUMNS_V2` (`CD_ID`,`COMMENT`,`COLUMN_NAME`,`TYPE_NAME`,`INTEGER_IDX`) VALUES (?,?,?,?,?)
      NestedThrowables:
      java.sql.BatchUpdateException: Duplicate entry '87-x' for key 'PRIMARY'

      If this is a desired limitation, we should have a friendly error message rather than propagating what looks like a metastore error in a JDBC call.

      This one can be worked around by selecting * from one table and only selected columns from the other table:

      [localhost:21000] > create view join_view as select j1.*, j2.a from j1 join j2 using ;
      Query: create view join_view as select j1.*, j2.a from j1 join j2 using
      [localhost:21000] >

      I was originally trying to see if CREATE VIEW could help in some cases where people are asking for CREATE TABLE AS SELECT functionality, where they mention the source tables can have hundreds of columns. So this workaround might not be practical in such use cases.

      Since the error is caused by a result set containing columns with identical names and values, perhaps CREATE VIEW could do something behind the scenes to remove the duplicated column(s), allowing the original SELECT * query to succeed. (I don't know if that violates some subtle rule related to the USING clause – I'm curious whether the same syntax would work in MySQL, Oracle, and PostgreSQL.)

      Attachments

        Activity

          People

            alex.behm Alexander Behm
            jrussell John Russell
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: