[SPARK-14118] Implement DDL/DML commands for Spark 2.0

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: SQL
    • Labels: None

    Description

      Right now, many DDL/DML commands are passed through to Hive for processing. This can cause missing functionality, failures with poor error messages, or inconsistent behavior (e.g. a command that works in some cases but fails in others). For Spark 2.0, it would be better for Spark to stop asking Hive to process these DDL/DML commands.

      You can find the doc at https://issues.apache.org/jira/secure/attachment/12793435/Implementing%20native%20DDL%20and%20DML%20statements%20for%20Spark%202.pdf (under SPARK-13879).

      There are mainly two kinds of commands: (1) Native, i.e. commands for which we want a native implementation in Spark; and (2) Exception, i.e. commands for which we should throw an exception. The doc also marks a few commands as TBD; for now, those should throw exceptions as well.

      Sub-tasks are created based on the doc. A command is represented by its corresponding Token.
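
      The Native/Exception split described above can be pictured as a lookup keyed by command token. The following is only a minimal sketch of that idea, not Spark's actual implementation: the token names, their assignment to categories, the error message, and the helper names (COMMAND_HANDLING, UnsupportedCommandError, plan_command) are all assumptions made for illustration. Tokens missing from the table stand in for the TBD commands and are rejected.

```python
from enum import Enum


class Handling(Enum):
    NATIVE = "native"        # Spark runs the command itself
    EXCEPTION = "exception"  # Spark rejects the command with a clear error


# Hypothetical token-to-handling table. The token names and their
# categories are illustrative assumptions, not taken from the design doc.
COMMAND_HANDLING = {
    "TOK_SHOWTABLES": Handling.NATIVE,
    "TOK_CREATEROLE": Handling.EXCEPTION,  # role management
    "TOK_EXPORT": Handling.EXCEPTION,      # import/export
}


class UnsupportedCommandError(Exception):
    """Raised for commands that are deliberately not supported."""


def plan_command(token: str) -> str:
    # Unknown tokens model the TBD commands: they default to EXCEPTION.
    handling = COMMAND_HANDLING.get(token, Handling.EXCEPTION)
    if handling is Handling.EXCEPTION:
        raise UnsupportedCommandError(f"Operation not allowed: {token}")
    return f"run {token} natively"
```

      Defaulting unknown tokens to the Exception path means a command that has not been classified yet fails with an explicit message rather than being silently passed on.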

        Issue Links

          1. Decide DDL/DML commands that need Spark native implementation in 2.0 | Sub-task | Resolved | Yin Huai
          2. Role management commands (Exception) | Sub-task | Resolved | Andrew Or
          3. Import/Export commands (Exception) | Sub-task | Resolved | Andrew Or
          4. Show commands (Native) | Sub-task | Resolved | Dilip Biswal
          5. Show commands (Exception) | Sub-task | Resolved | Andrew Or
          6. Function related commands | Sub-task | Resolved | L. C. Hsieh
          7. Database related commands | Sub-task | Resolved | Xiao Li
          8. View related commands | Sub-task | Resolved | Xiao Li
          9. [Table related commands] Truncate table | Sub-task | Resolved | Lianhui Wang
          10. [Table related commands] Describe table | Sub-task | Resolved | Cheng Lian
          11. [Table related commands] For a table related commands, it should be able to distinguish data source tables and hive tables | Sub-task | Resolved | Andrew Or
          12. [Table related commands] Alter table | Sub-task | Resolved | Andrew Or
          13. [Table related commands] Alter column | Sub-task | Resolved | Yin Huai
          14. [Table related commands] Alter partition | Sub-task | Resolved | Andrew Or
          15. [Table related commands] Others | Sub-task | Resolved | Suresh Thalamati
          16. Describe function command returns wrong output because some of built-in functions are not in function registry. | Sub-task | Resolved | Yong Tang
          17. SHOW CREATE TABLE command (Native) | Sub-task | Resolved | Cheng Lian
          18. Remove alterFunction from SessionCatalog | Sub-task | Resolved | Yin Huai
          19. Create Table | Sub-task | Resolved | Andrew Or
          20. Drop Table | Sub-task | Resolved | Xiao Li
          21. For data source tables, we should not allow users to set/change partition locations | Sub-task | Resolved | Andrew Or
          22. Make error messages consistent across DDLs | Sub-task | Resolved | Andrew Or
          23. Consolidate DDL tests | Sub-task | Resolved | Andrew Or
          24. Add tests to make sure drop partitions of an external table will not delete data | Sub-task | Resolved | Xiao Li
          25. HiveClientImpl's toHiveTable misses a table property for external tables | Sub-task | Resolved | Yin Huai
          26. Decide if we should still support CREATE EXTERNAL TABLE AS SELECT | Sub-task | Resolved | Yin Huai
          27. Create table like | Sub-task | Resolved | L. C. Hsieh
          28. SessionCatalog needs to check if a metadata operation is valid | Sub-task | Resolved | Xiao Li
          29. Show columns/partitions | Sub-task | Resolved | Dilip Biswal
          30. LOAD DATA | Sub-task | Resolved | L. C. Hsieh
          31. When creating a view, we should verify both the input SQL and the generated SQL | Sub-task | Resolved | Reynold Xin
          32. Restructure commands.scala | Sub-task | Resolved | Reynold Xin
          33. Native DDL Command Support for Describe Function in Non-identifier Format | Sub-task | Resolved | Xiao Li
          34. Add PARTITIONED BY and CLUSTERED BY clause for data source CTAS syntax | Sub-task | Resolved | Cheng Lian
          35. Use the value of spark.sql.warehouse.dir as the warehouse location instead of using hive.metastore.warehouse.dir | Sub-task | Resolved | Yin Huai
          36. SessionCatalog needs to set the location for default DB | Sub-task | Resolved | Yin Huai
          37. When creating a database, we need to qualify its path | Sub-task | Resolved | Yin Huai
          38. Support creating temporary views with DDL | Sub-task | Resolved | Xiang Zhong
          39. For a data source table, Describe table needs to handle spark.sql.sources.schema | Sub-task | Resolved | Xiang Zhong
          40. Reset Command | Sub-task | Resolved | Xiao Li

            People

              Assignee: Yin Huai (yhuai)
              Reporter: Yin Huai (yhuai)
              Votes: 0
              Watchers: 13
