Spark / SPARK-48338

SQL Scripting support for Spark SQL

Details

    • Type: Epic
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • 4.0.0
    • None
    • Spark Core
    • Sql Scripting

    Description

      The design doc for this feature is attached.

      High-level example of a SQL script:

      ```
      BEGIN
        DECLARE c INT = 10;
        -- Insert 10, 9, ..., 1 into tscript, one row per iteration.
        WHILE c > 0 DO
          INSERT INTO tscript VALUES (c);
          SET c = c - 1;
        END WHILE;
      END
      ```
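
      The sub-tasks of this epic also cover labels and the IF, LOOP, and LEAVE statements. As a rough sketch of how the same countdown might look with those constructs: the syntax below is assumed from SQL/PSM conventions and is not confirmed by this issue, the label name `countdown` is made up for illustration, and `tscript` is assumed to be an existing table with a single INT column.

      ```sql
      -- Sketch only: syntax assumed from SQL/PSM conventions.
      -- Assumes tscript (one INT column) already exists.
      BEGIN
        DECLARE c INT = 10;
        -- "countdown" is a hypothetical label name.
        countdown: LOOP
          -- Exit the labeled loop once the counter reaches zero.
          IF c <= 0 THEN
            LEAVE countdown;
          END IF;
          INSERT INTO tscript VALUES (c);
          SET c = c - 1;
        END LOOP countdown;
      END
      ```

      If the assumed syntax holds, this should insert the same rows as the WHILE version; the explicit LEAVE simply moves the exit condition inside the loop body.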

      High-level motivation behind this feature:
      SQL Scripting lets customers develop complex ETL and analysis entirely in SQL. Until now, they have had to write verbose SQL statements or combine SQL with Python to express business logic efficiently. Customers coming from another system have to decide whether to migrate to PySpark, and some end up not using Spark because of this gap. SQL Scripting is a key milestone towards enabling SQL practitioners to write sophisticated queries without needing PySpark. It is also a necessary step towards support for SQL Stored Procedures and, together with SQL Variables (released) and Temp Tables (in progress), will allow for more seamless data warehouse migrations.

      Attachments

        1. [Design Doc] Sql Scripting - OSS.pdf
          128 kB
          David Milicevic
        2. Sql Scripting - OSS.odt
          33 kB
          Aleksandar Tomic

        Issue Links

          1. [M0] Parser support (Sub-task, Resolved, David Milicevic)
          2. [M0] Interpreter support (Sub-task, Resolved, David Milicevic)
          3. [M0] Support for exceptions thrown from parser/interpreter (Sub-task, Resolved, Milan Dankovic)
          4. [M0] Improve exceptions thrown from parser/interpreter (Sub-task, Resolved, Unassigned)
          5. [M0] Support for labels (Sub-task, Resolved, David Milicevic)
          6. [M0] Checks for variable declarations (Sub-task, Resolved, David Milicevic)
          7. [M0] Fix SET behavior for scripts (Sub-task, Resolved, David Milicevic)
          8. [M0] Support for IF ELSE statement (Sub-task, Resolved, David Milicevic)
          9. [M0] Support for WHILE statement (Sub-task, Resolved, Momcilo Mrkaic)
          10. [M0] Support for LEAVE statement (Sub-task, Resolved, David Milicevic)
          11. [M0] Spark Connect investigation (Sub-task, Resolved, Unassigned)
          12. [M0] Private documentation (Sub-task, Open, Unassigned)
          13. [M1] Support for CASE statement (Sub-task, Resolved, Dusan Tisma)
          14. [M1] Support for LOOP statement (Sub-task, Resolved, Dusan Tisma)
          15. [M1] Support for REPEAT statement (Sub-task, Resolved, Dusan Tisma)
          16. [M1] Support for ITERATE statement (Sub-task, Resolved, David Milicevic)
          17. [M1] Further exception improvements (Sub-task, Resolved, Dusan Tisma)
          18. [M1] SQL Script execution (Sub-task, Resolved, Milan Dankovic)
          19. [M1] Support for local variables (Sub-task, Open, Unassigned)
          20. [M1] Exception handling (Sub-task, Open, Unassigned)
          21. [M1] Support for SIGNAL statement (Sub-task, Open, Unassigned)
          22. [M1] Support for FOR statement (Sub-task, Resolved, David Milicevic)
          23. [M1] CASE statement improvements (Sub-task, Open, Unassigned)
          24. [M1] Support for PRINT/TRACE statement (Sub-task, Open, Unassigned)
          25. [M1] Unique label names (Sub-task, Resolved, Milan Dankovic)
          26. [M1] Public documentation (Sub-task, Open, Unassigned)
          27. [M1] Spark Connect improvements (Sub-task, Open, Unassigned)
          28. [M0] Testing and operational readiness (Sub-task, Open, Unassigned)
          29. [M1] Performance benchmark (Sub-task, Open, Unassigned)
          30. [M1] Multiple results API - sqlScript() (Sub-task, Open, Unassigned)
          31. [M1] Fix grammar allowing empty bodies for loops, IF and CASE (Sub-task, Resolved, Dusan Tisma)
          32. [M1] FOR statement investigate SparkPlan.executeToIterator optimization (Sub-task, Open, Unassigned)
          33. [M1] Fix bug where empty BEGIN END blocks throw error in loops (Sub-task, Resolved, Dusan Tisma)

            People

              Assignee: Aleksandar Tomic (dbatomic)
              Reporter: Aleksandar Tomic (dbatomic)
              Votes: 0
              Watchers: 11
