Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-530 Phoenix support
  3. CALCITE-493

Add EXTEND clause, for defining columns and their types at query/DML time

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0-incubating
    • Component/s: None
    • Labels:

      Description

      Often times only being able to define a static schema over which to query is not flexible enough. It would be useful to be able to define columns and their type at query/DML time. For an example syntax, see http://phoenix.apache.org/dynamic_columns.html

        Issue Links

          Activity

          Hide
          julianhyde Julian Hyde added a comment -

          It plays nicely with Calcite internals, because Calcite can just augment the definition of the table it read from the catalog before it starts validating and planning the query.

          And it’s nice for end-users, because they get to use strongly-typed columns in their query.

          It’s pretty close to standard SQL

          SELECT ... FROM t [ AS ] tableAlias (columnAlias1, columnAlias2, ...)
          

          Standard SQL allows you to omit AS but not tableAlias, so I think your proposed syntax is not ambiguous. However an extra keyword would be helpful. What would you (and Phoenix users) think of this syntax:

          SELECT ... FROM t EXTEND (c1 type1, c2 type2)
          

          It would occur before the table-alias and column-aliases, if present. The following are all valid:

          SELECT e.id, e.my_horoscope
          FROM emp EXTEND (horoscope VARCHAR(100))  AS e (id, name, deptno, my_horoscope);
          
          SELECT e.id, e.my_horoscope
          FROM emp EXTEND (horoscope VARCHAR(100)) e (id, name, deptno, my_horoscope);
          
          SELECT e.empno, e.horoscope
          FROM emp EXTEND (horoscope VARCHAR(100)) e;
          
          SELECT e.empno, e.horoscope
          FROM emp EXTEND (horoscope VARCHAR(100)) AS e;
          
          SELECT e.empno, e.horoscope
          FROM emp EXTEND (horoscope VARCHAR(100));
          

          Clearly, extended columns are only valid on base tables. (Not sub-queries, views, table functions, or relations defined in the WITH clause.)

          We should have a means for a table to say it is OK to ask for augmenting columns. I propose an extra method:

          interface Table {
            /** Returns a new Table containing all of the fields of this Table, plus the given fields. */
            Table augment(List<RelDataTypeField> fields);
          }
          

          Note if the table implements the expansion interfaces such as ScannableTable, FilterableTable, ProjectableFilterableTable, the object returned from augment may or may not implement those interfaces too.

          Show
          julianhyde Julian Hyde added a comment - It plays nicely with Calcite internals, because Calcite can just augment the definition of the table it read from the catalog before it starts validating and planning the query. And it’s nice for end-users, because they get to use strongly-typed columns in their query. It’s pretty close to standard SQL SELECT ... FROM t [ AS ] tableAlias (columnAlias1, columnAlias2, ...) Standard SQL allows you to omit AS but not tableAlias , so I think your proposed syntax is not ambiguous. However an extra keyword would be helpful. What would you (and Phoenix users) think of this syntax: SELECT ... FROM t EXTEND (c1 type1, c2 type2) It would occur before the table-alias and column-aliases, if present. The following are all valid: SELECT e.id, e.my_horoscope FROM emp EXTEND (horoscope VARCHAR(100)) AS e (id, name, deptno, my_horoscope); SELECT e.id, e.my_horoscope FROM emp EXTEND (horoscope VARCHAR(100)) e (id, name, deptno, my_horoscope); SELECT e.empno, e.horoscope FROM emp EXTEND (horoscope VARCHAR(100)) e; SELECT e.empno, e.horoscope FROM emp EXTEND (horoscope VARCHAR(100)) AS e; SELECT e.empno, e.horoscope FROM emp EXTEND (horoscope VARCHAR(100)); Clearly, extended columns are only valid on base tables. (Not sub-queries, views, table functions, or relations defined in the WITH clause.) We should have a means for a table to say it is OK to ask for augmenting columns. I propose an extra method: interface Table { /** Returns a new Table containing all of the fields of this Table, plus the given fields. */ Table augment(List<RelDataTypeField> fields); } Note if the table implements the expansion interfaces such as ScannableTable , FilterableTable , ProjectableFilterableTable , the object returned from augment may or may not implement those interfaces too.
          Hide
          julianhyde Julian Hyde added a comment -

          Fixed in http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/ecc5f4c2.

          Rather than adding a method to Table, I created a new interface ExtensibleTable extends Table with a method Table extend(List<RelDataTypeField>). If an adapter supports extensions for hidden columns, then you should make your Table objects implement ExtensibleTable.

          Show
          julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/ecc5f4c2 . Rather than adding a method to Table , I created a new interface ExtensibleTable extends Table with a method Table extend(List<RelDataTypeField>) . If an adapter supports extensions for hidden columns, then you should make your Table objects implement ExtensibleTable .
          Hide
          jamestaylor James Taylor added a comment -

          +1. This is very nice, Julian Hyde.

          Show
          jamestaylor James Taylor added a comment - +1. This is very nice, Julian Hyde .
          Hide
          vladimirsitnikov Vladimir Sitnikov added a comment -

          Julian Hyde, does it work for table functions?

          Show
          vladimirsitnikov Vladimir Sitnikov added a comment - Julian Hyde , does it work for table functions?
          Hide
          julianhyde Julian Hyde added a comment -

          Not at present. It could be extended to handle table functions, if there's a need for it. The hardest part would be extending the TableFunction interface to ask for the extra columns (analogous to how we added ExtensibleTable extends Table). I'd like to see a real use case before I jump in and do it.

          By the way, TableMacro could be made extensible too, and with less effort than TableFunction; its apply method would just need to return a TranslatableTable that also implements ExtensibleTable.

          Show
          julianhyde Julian Hyde added a comment - Not at present. It could be extended to handle table functions, if there's a need for it. The hardest part would be extending the TableFunction interface to ask for the extra columns (analogous to how we added ExtensibleTable extends Table). I'd like to see a real use case before I jump in and do it. By the way, TableMacro could be made extensible too, and with less effort than TableFunction; its apply method would just need to return a TranslatableTable that also implements ExtensibleTable.
          Hide
          vladimirsitnikov Vladimir Sitnikov added a comment -

          Julian Hyde,
          How does this feature interact with _extra field in RelDataType?
          Can we/should we drop _extra in favour of extend clause?

          Show
          vladimirsitnikov Vladimir Sitnikov added a comment - Julian Hyde , How does this feature interact with _extra field in RelDataType ? Can we/should we drop _extra in favour of extend clause?
          Hide
          jamestaylor James Taylor added a comment -

          This looks very good for SELECT, but how would you insert a dynamic column value at DML time using INSERT/UPSERT VALUES, Julian Hyde? For example, in Phoenix you can do this:

          UPSERT INTO T(a, b, c VARCHAR) VALUES ('foo','bar','bas');
          UPSERT INTO T(a, b, c VARCHAR) SELECT d, e, f FROM OTHER_T;
          
          Show
          jamestaylor James Taylor added a comment - This looks very good for SELECT, but how would you insert a dynamic column value at DML time using INSERT/UPSERT VALUES, Julian Hyde ? For example, in Phoenix you can do this: UPSERT INTO T(a, b, c VARCHAR) VALUES ('foo','bar','bas'); UPSERT INTO T(a, b, c VARCHAR) SELECT d, e, f FROM OTHER_T;
          Hide
          julianhyde Julian Hyde added a comment -

          I haven't tried it, but any table reference can have EXTEND, not just those in the FROM clause of a SELECT. Thus

          UPSERT INTO T EXTEND (c VARCHAR) (a, b, c) VALUES ('foo','bar','bas');
          UPSERT INTO T EXTEND (c VARCHAR) (a, b, c) SELECT d, e, f FROM OTHER_T;
          

          Yeah, it ain't pretty.

          Show
          julianhyde Julian Hyde added a comment - I haven't tried it, but any table reference can have EXTEND, not just those in the FROM clause of a SELECT. Thus UPSERT INTO T EXTEND (c VARCHAR) (a, b, c) VALUES ('foo','bar','bas'); UPSERT INTO T EXTEND (c VARCHAR) (a, b, c) SELECT d, e, f FROM OTHER_T; Yeah, it ain't pretty.
          Hide
          jamestaylor James Taylor added a comment -

          Ah, ok. This is fine then.

          Show
          jamestaylor James Taylor added a comment - Ah, ok. This is fine then.
          Hide
          julianhyde Julian Hyde added a comment -

          Closing now that 1.0.0-incubating has been released.

          Show
          julianhyde Julian Hyde added a comment - Closing now that 1.0.0-incubating has been released.

            People

            • Assignee:
              julianhyde Julian Hyde
              Reporter:
              jamestaylor James Taylor
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development