SPARK-38334

Implement support for DEFAULT values for columns in tables

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: SQL

    Description

      This story tracks the implementation of DEFAULT values for columns in tables.

      CREATE TABLE and ALTER TABLE statements will support setting default values for columns. Subsequent INSERT, UPDATE, and MERGE statements may then refer to a column's default value with the DEFAULT keyword.

      Examples:

      CREATE TABLE T(a INT, b INT NOT NULL);
      
      -- Without an explicit DEFAULT clause, a column's default is NULL
      INSERT INTO T VALUES (DEFAULT, 0);
      INSERT INTO T(b)  VALUES (1);
      SELECT * FROM T;
      (NULL, 0)
      (NULL, 1)
      
      -- Adding a column with a default to a table that already has rows sets the
      -- value for existing rows (exist default) and for new rows (current default).
      ALTER TABLE T ADD COLUMN c INT DEFAULT 5;
      INSERT INTO T VALUES (1, 2, DEFAULT);
      SELECT * FROM T;
      (NULL, 0, 5)
      (NULL, 1, 5)
      (1, 2, 5) 
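
      The sub-tasks for this story also cover defaults declared in CREATE TABLE and changed through ALTER TABLE ALTER COLUMN. The following is a minimal sketch of how those statements might be combined, assuming Spark 3.4 or later and a data source on the allowlist for DEFAULT columns (parquet is assumed here). The table name U and its columns are hypothetical and do not come from the issue.

      ```sql
      -- Hypothetical table; parquet is assumed to be a supported data source.
      CREATE TABLE U (id INT, status STRING DEFAULT 'new') USING parquet;

      -- Both inserts store 'new' in status: one names DEFAULT explicitly,
      -- the other omits the column from the user-specified column list.
      INSERT INTO U VALUES (1, DEFAULT);
      INSERT INTO U (id) VALUES (2);

      -- Change the current default. Rows already written keep their stored
      -- values; only later inserts pick up 'pending'.
      ALTER TABLE U ALTER COLUMN status SET DEFAULT 'pending';
      INSERT INTO U (id) VALUES (3);

      -- Drop the default. Later inserts that omit status fall back to NULL.
      ALTER TABLE U ALTER COLUMN status DROP DEFAULT;
      ```

      UPDATE and MERGE accept the DEFAULT keyword in the same way, but they also require a table format that supports those commands, so they are left out of this sketch.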

        Sub-Tasks

          1. Catalyst changes for DEFAULT column support (Sub-task, Resolved, Daniel)
          2. Parser changes for DEFAULT column support (Sub-task, Resolved, Daniel)
          3. Support INSERT INTO user specified column lists with DEFAULT values (Sub-task, Resolved, Daniel)
          4. Support ALTER TABLE ADD COLUMN commands with DEFAULT values (Sub-task, Resolved, Daniel)
          5. Support V2 data sources with DEFAULT values (Sub-task, Resolved, Daniel)
          6. Support ALTER TABLE ALTER COLUMN commands with DEFAULT values (Sub-task, Resolved, Daniel)
          7. Respect Table capability `ACCEPT_ANY_SCHEMA` in default column resolution (Sub-task, Resolved, Daniel)
          8. Support UPDATE commands with DEFAULT values (Sub-task, Resolved, Daniel)
          9. Support MERGE commands with DEFAULT values (Sub-task, Resolved, Daniel)
          10. Disable DEFAULT column SQLConf until implementation is complete (Sub-task, Resolved, Unassigned)
          11. Support CSV file scans with DEFAULT values (Sub-task, Resolved, Daniel)
          12. Support JSON file scans with default values (Sub-task, Resolved, Daniel)
          13. Support Avro file scans with DEFAULT values (Sub-task, Resolved, Unassigned)
          14. Support Parquet file scans with DEFAULT values (Sub-task, Resolved, Daniel)
          15. Support Orc file scans with DEFAULT values (Sub-task, Resolved, Daniel)
          16. Restrict DEFAULT columns to allowlist of supported data source types (Sub-task, Resolved, Daniel)
          17. Support ARRAY, STRUCT, MAP types as DEFAULT values (Sub-task, Resolved, Daniel)
          18. Prohibit subquery expressions in DEFAULT values for now (Sub-task, Resolved, Daniel)
          19. Fix bug in ARRAY, STRUCT, MAP types with DEFAULT values with NULL field(s) (Sub-task, Resolved, Daniel)
          20. Prohibit UDF expressions in DEFAULT values for now (Sub-task, Resolved, Unassigned)
          21. Restrict adding DEFAULT columns for existing tables to allowlist of supported data source types (Sub-task, Resolved, Daniel)
          22. Fix bug in existence DEFAULT value lookups for V2 data sources (Sub-task, Resolved, Daniel)
          23. Fix bug in existence DEFAULT value lookups for non-vectorized Parquet scans (Sub-task, Resolved, Daniel)
          24. Test DEFAULT column values with DataFrames (Sub-task, Resolved, Daniel)
          25. Add config to toggle whether to automatically add default values for INSERTs without user-specified fields (Sub-task, Resolved, Unassigned)
          26. Add config to make DEFAULT values in JSON tables mutually exclusive with SQLConf.JSON_GENERATOR_IGNORE_NULL_FIELDS (Sub-task, Resolved, Daniel)
          27. Fix a correctness bug in existence DEFAULT value lookups for the Orc data source (Sub-task, Resolved, Daniel)
          28. Include column default values in DESCRIBE output for V1 tables (Sub-task, Resolved, Daniel)
          29. Include column default values in DESCRIBE output for V2 tables (Sub-task, Resolved, Daniel)
          30. Add NULL values for INSERT commands with user-specified lists of fewer columns than the target table (Sub-task, Resolved, Daniel)
          31. Fix bug with timestamp literals (Sub-task, Resolved, Daniel)
          32. Support SELECT DEFAULT with ORDER BY, LIMIT, OFFSET for INSERT source relation (Sub-task, Resolved, Daniel)
          33. Fix bug in column DEFAULT assignment for target tables with multi-part names (Sub-task, Resolved, Daniel)
          34. Fix bug when querying from table after changing defaults (Sub-task, Resolved, Daniel)
          35. Run optimizer on CREATE TABLE column defaults (Sub-task, Resolved, Daniel)
          36. Run optimizer on REPLACE TABLE column defaults (Sub-task, Open, Unassigned)
          37. Include coverage of JSON data sources in array/struct/map default value tests (Sub-task, Resolved, Mark Jarvin)
          38. ALTER COLUMN DROP DEFAULT test fails with JSON data sources (Sub-task, Open, Unassigned)

            People

              Assignee: Daniel (dtenedor)
              Reporter: Daniel (dtenedor)
              Shepherd: Gengliang Wang
              Votes: 0
              Watchers: 6
