CALCITE-411: Relax ProjectRelBase restriction on duplicate names

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      ProjectRelBase:180

      !Util.isDistinct(rowType.getFieldNames())

      disallows duplicate field names.

      But the following is allowed in MySQL, PostgreSQL and Hive:

      create table t1(x int, y int);
      select x,x from t1;
      

      Can Optiq relax this check?
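For readers unfamiliar with the check quoted in the description, the following is a minimal sketch of what a distinctness test over field names amounts to. It is not Optiq's actual `Util.isDistinct` implementation; the class name `DistinctNamesSketch` and its method are hypothetical stand-ins written for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a field-name distinctness check; not Optiq code. */
public class DistinctNamesSketch {
  /** Returns true if no name occurs more than once in the list. */
  public static boolean isDistinct(List<String> names) {
    Set<String> seen = new HashSet<>();
    for (String name : names) {
      // Set.add returns false when the element is already present.
      if (!seen.add(name)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Field names as they would come from "select x, y from t1".
    System.out.println(isDistinct(Arrays.asList("x", "y")));
    // Field names as they would come from "select x, x from t1".
    System.out.println(isDistinct(Arrays.asList("x", "x")));
  }
}
```

Under this sketch, a projection such as `select x, x from t1` fails the check only if both output fields keep the name `x` in the row type.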

        Activity

        julianhyde Julian Hyde added a comment (edited)

        Optiq supports duplicate aliases in SQL too (see https://github.com/julianhyde/optiq/commit/f7158805013c03f9e0252cb7808c26358a5c111c). But it does so without duplicate names in RelNode records. Allowing duplicate names there would make this case easier, but some other things more difficult.

        Optiq's JDBC driver gets the column names from the validated AST. Could Hive do the same?

        It's also possible that some other properties of the output columns, e.g. their precise type, change during the optimization process. That's another reason to keep the logical type info around.

        By the way, at one point I considered banning duplicate expressions (regardless of their names) in "ProjectRel($0, $0)" because they are usually wasteful and confusing to other rules. But I relented because you sometimes need 'select a, a ...' at the top level and in the child of a union.
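The separation described in the comment above (unique field names inside the plan, user-facing column labels taken from the validated query) can be sketched roughly as follows. This is an illustration under stated assumptions, not Optiq or Hive code: the class `ColumnLabelsSketch`, the method `uniquify`, and the numeric-suffix scheme are all invented here, and the real naming scheme may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: keep user-facing labels apart from internal names. */
public class ColumnLabelsSketch {
  /**
   * Returns a copy of the names in which later duplicates receive a numeric
   * suffix, so the result contains no repeated name. The suffix scheme is an
   * assumption made for this example.
   */
  public static List<String> uniquify(List<String> names) {
    Set<String> used = new HashSet<>();
    List<String> result = new ArrayList<>();
    for (String name : names) {
      String candidate = name;
      int i = 0;
      // Keep appending a counter until the candidate is unused.
      while (!used.add(candidate)) {
        candidate = name + i++;
      }
      result.add(candidate);
    }
    return result;
  }

  public static void main(String[] args) {
    // Labels as the user wrote them in "select x, x from t1".
    List<String> labels = Arrays.asList("x", "x");
    // Internal field names used while planning; duplicates are renamed.
    List<String> internal = uniquify(labels);
    // A driver would report 'labels' to the client, not 'internal'.
    System.out.println("labels=" + labels + " internal=" + internal);
  }
}
```

With this split, the planner can keep its distinct-names invariant while the client still sees the column labels it asked for.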

        rhbutani Harish Butani added a comment

        Makes sense.

        I don't know if Hive JDBC can get names from the AST. There may be a way using the SchemaReader in Hive; I will follow up.

        But this is not an Optiq issue. Thanks for checking.


          People

          • Assignee: julianhyde Julian Hyde
          • Reporter: rhbutani Harish Butani
          • Votes: 0
          • Watchers: 3