[FLINK-16627] Support only generate non-null values when serializing into JSON - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Not a Priority
Resolution: Fixed
Affects Version/s: 1.10.0
Fix Version/s: 1.20.0
Component/s: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table SQL / Planner
Labels:

Description

//sql
CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)

//sql
CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)

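//scala udf
import org.apache.flink.table.functions.ScalarFunction

// Despite its name, this UDF does not upper-case anything:
// it maps a null input to "" and returns any other input unchanged.
class ScalaUpper extends ScalarFunction {
  def eval(str: String): String =
    if (str == null) "" else str
}

// btenv is the job's table environment, defined elsewhere
btenv.registerFunction("scala_upper", new ScalaUpper())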

//sql
INSERT INTO sink_kafka SELECT subtype, scala_upper(svt) FROM source_kafka

Sometimes the value of svt is null, and the JSON inserted into Kafka then looks like {"subtype":"qin","svt":null}

If the amount of data is small, this is acceptable, but we process 10 TB of data every day, and the JSON may contain many nulls, which hurts efficiency. If a parameter could be added to the sink table definition to remove keys whose value is null, performance would improve greatly.
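For illustration, the requested parameter could look like the sketch below. It assumes the JSON format option is named 'json.encode.ignore-null-fields', which is how the feature is understood to have shipped in the fix version; check the JSON format documentation of the Flink release in use. The connector, topic, and broker address are hypothetical placeholders and are not taken from this issue.

```sql
-- Sketch only: connector, topic and broker address are placeholders.
CREATE TABLE sink_kafka (
  subtype STRING,
  svt STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'sink_topic',                             -- hypothetical
  'properties.bootstrap.servers' = 'localhost:9092',  -- hypothetical
  'format' = 'json',
  -- Assumed option name: omit keys whose value is NULL when encoding.
  'json.encode.ignore-null-fields' = 'true'
);
```

With such an option enabled, a row with subtype 'qin' and a NULL svt would be expected to be written as {"subtype":"qin"} instead of {"subtype":"qin","svt":null}.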

Issue Links

links to

GitHub Pull Request #18310

GitHub Pull Request #24430

People

Assignee: yisha zhou

Reporter: jackray wang

Votes: 0

Watchers: 9

Dates

Created: 17/Mar/20 05:03

Updated: 06/Mar/24 03:36

Resolved: 05/Mar/24 15:25