Spark / SPARK-31936

Implement ScriptTransform in sql/core

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: 3.0.0, 3.1.0, 3.2.0
    • Fix Version/s: 3.2.0
    • Component/s: SQL
    • Labels: None

    Description

      ScriptTransformation currently relies on Hive internals. It would be great to implement a native ScriptTransformation in the sql/core module so that this extra Hive dependency can be removed.

      Sub-Tasks

        1. Spark can't support TRANSFORM with aggregation (Sub-task, Resolved, angerszhu)
        2. Support processing array/map/struct types in Spark no-serde mode (Sub-task, Resolved, angerszhu)
        3. Refine the usage constraints of script transform (Sub-task, Resolved, Unassigned)
        4. Fix string value error for Date/Timestamp in ScriptTransform (Sub-task, Resolved, Unassigned)
        5. Refactor current script transform code (Sub-task, Resolved, angerszhu)
        6. Implement script transform in sql/core (Sub-task, Resolved, angerszhu)
        7. Schema-less TRANSFORM should behave the same as Hive (Sub-task, Resolved, angerszhu)
        8. TRANSFORM with Hive serde should support CalendarIntervalType and UserDefinedType (Sub-task, Resolved, Unassigned)
        9. Test coverage of HiveScriptTransformationExec (Sub-task, Resolved, angerszhu)
        10. Script transform: extract a common method from row processing to avoid repeated checks (Sub-task, Resolved, angerszhu)
        11. Catch and add an exception for data types that HiveInspector can't convert (Sub-task, Resolved, Unassigned)
        12. Script transformation no-serde `TOK_TABLEROWFORMATLINES` only supports `\n` (Sub-task, Resolved, angerszhu)
        13. Script transform DELIMIT value should be formatted (Sub-task, Resolved, angerszhu)
        14. Script transformation no-serde mode: when there are fewer columns than the output length, fill with null (Sub-task, Resolved, angerszhu)
        15. Update "no-serde" in the codebase in other TRANSFORM PRs (Sub-task, Closed, Unassigned)
        16. Add a test case for Hive serde/default-serde mode's null value '\\N' (Sub-task, Resolved, angerszhu)
        17. Script transform Hive serde default field.delimit is '\t' (Sub-task, Resolved, angerszhu)
        18. Add a configuration to control the legacy behavior of padding null values when the value size is less than the schema size (Sub-task, Resolved, angerszhu)
        19. Spark SQL no-serde row format field delimiter default is '\u0001' (Sub-task, Resolved, angerszhu)
        20. Support a user-defined script command wrapper for more use cases (Sub-task, Resolved, angerszhu)
        21. Add a dedicated SQL document page for the TRANSFORM-related functionality (Sub-task, Resolved, angerszhu)
        22. TRANSFORM with clusterby/orderby/sortby (Sub-task, Resolved, angerszhu)
        23. Refactor ScriptTransformation to remove the input parameter and replace it with child.output (Sub-task, Resolved, angerszhu)
        24. TRANSFORM should forbid DISTINCT/ALL and make the error message clear (Sub-task, Resolved, angerszhu)
        25. TRANSFORM should not support ALIAS in input expression seq (Sub-task, Resolved, angerszhu)
        26. Extract doc of Hive format (Sub-task, Resolved, angerszhu)
        27. DayTimeIntervalType/YearMonthIntervalType display differently in Hive serde and row format delimited (Sub-task, Resolved, angerszhu)
        28. Add SQL doc about TRANSFORM for current behavior (Sub-task, Resolved, angerszhu)
        29. ROW FORMAT SERDE should handle null value as `\N` (Sub-task, Resolved, angerszhu)
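
        Several of the sub-tasks listed for this issue describe how no-serde mode exchanges rows with the external script: fields separated by '\u0001', rows separated by '\n', and missing output columns filled with null. The following is only a rough Python sketch of that idea, not Spark's implementation. The names `transform`, `pad_fields` and `IDENTITY_SCRIPT` are hypothetical, input values are assumed to be non-null, and truncating extra output fields is an assumption made for illustration.

```python
import subprocess
import sys

FIELD_DELIM = "\u0001"  # default field delimiter named in the sub-task list
LINE_DELIM = "\n"       # the only row delimiter named in the sub-task list

# Hypothetical stand-in for a user script: copies stdin to stdout unchanged.
IDENTITY_SCRIPT = [sys.executable, "-c",
                   "import sys; sys.stdout.write(sys.stdin.read())"]


def pad_fields(fields, width):
    """Fill missing trailing fields with None (null).

    Dropping fields beyond `width` is an assumption for this sketch.
    """
    return (fields + [None] * width)[:width]


def transform(rows, command, output_width):
    """Pipe rows through an external command as delimited text.

    Assumes every input value is non-null and contains no delimiter.
    """
    payload = "".join(
        FIELD_DELIM.join(str(value) for value in row) + LINE_DELIM
        for row in rows
    )
    result = subprocess.run(command, input=payload, capture_output=True,
                            text=True, check=True)
    lines = result.stdout.split(LINE_DELIM)
    if lines and lines[-1] == "":
        lines.pop()  # discard the empty piece after the final row delimiter
    return [pad_fields(line.split(FIELD_DELIM), output_width) for line in lines]


if __name__ == "__main__":
    # Two input columns, three declared output columns: the third is padded.
    print(transform([(1, "a"), (2, "b")], IDENTITY_SCRIPT, 3))
```

        In this sketch every value comes back from the script as a string; converting it to the declared output column type would be a separate step.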

          People

            Assignee: angerszhu
            Reporter: angerszhu
            Votes: 0
            Watchers: 6
