Uploaded image for project: 'Apache AsterixDB'
  1. Apache AsterixDB
  2. ASTERIXDB-1226

Support correlations in the SQL++ core of group-by

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: *DB - AsterixDB
    • Labels:
      None

      Description

      The current group-by semantics for SQL++ loses the correlation that exists
      in an incoming tuple.

      We should support binding a group to a variable as the core of SQL++.
      The group-by syntax should be:
      GroupbyClause ::= <GROUP> <BY> ( Expression ( ( <AS> )? Variable )? ( <COMMA> Expression ( ( <AS> )? Variable )? )* )
      (<GROUP> <AS> Variable <WITH> ( Expression ( ( <AS> )? Identifier )? ( <COMMA> Expression ( ( <AS> )? Identifier )? )* ) )?

      This is an example of the SQL++ core version of GROUP BY:
      Q1: SELECT deptId AS deptId,
      SUM((SELECT ELEMENT p.e1.salary + p.b1.bonus FROM eb_pairs p)) AS totalCompensation
      FROM Employee e JOIN Bonus b ON e.job_category = b.job_category
      GROUP BY e.department_id as deptId GROUP AS eb_pairs WITH e AS e1, b AS b1

      A syntactic sugar of this query is:
      Q1 (sugar): SELECT deptId AS deptId,
      SUM(e1.salary + b1.bonus) AS totalCompensation
      FROM Employee e JOIN Bonus b ON e.job_category = b.job_category
      GROUP BY e.department_id as deptId

      The sugar gets rewritten into the core by replacing expressions in aggregation functions, projection expressions (non-subquery), or "FROM" clauses of subqueries, that contain free variables into subqueries and adding the GROUP AS binding.
      In the above query:
      "e1.salary + b1.bonus" is the expression to be rewritten, and it will become:
      (SELECT ELEMENT p.e1.salary + p.b1.bonus FROM eb_pairs p)
      and at the same time the GROUP AS binding is added:
      "GROUP AS eb_pairs WITH e AS e1, b AS b1"

      In another example syntactic sugar:
      Q2 (sugar): SELECT deptId AS deptId,
      (SELECT ELEMENT e.name FROM e as e ORDER BY e.salary LIMIT 3)
      FROM Employee e
      GROUP BY e.department_id as deptId

      The expression to be rewritten in Q2 (sugar) is e. The core version is:
      Q2: SELECT deptId AS deptId,
      (SELECT ELEMENT e1.name FROM
      (SELECT ELEMENT g.e1 FROM ep_pairs AS g) AS e1
      ORDER BY e1.salary
      LIMIT 3)
      FROM Employee e
      GROUP BY e.department_id as deptId GROUP AS eb_pairs WITH e AS e1

        Attachments

          Activity

            People

            • Assignee:
              buyingyi Yingyi Bu
              Reporter:
              buyingyi Yingyi Bu
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: