Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Won't Fix
- Affects Version/s: 2.3.1
Description
Currently, the from_json() function accepts only a string literal as its schema argument:
- The schema argument is checked inside JsonToStructs: https://github.com/apache/spark/blob/b8f27ae3b34134a01998b77db4b7935e7f82a4fe/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L530
- Only a string literal is accepted: https://github.com/apache/spark/blob/b8f27ae3b34134a01998b77db4b7935e7f82a4fe/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L749-L752
JsonToStructs should be modified to accept the results of aggregate functions such as infer_schema (see SPARK-24642). It should then be possible to write SQL like:
select from_json(json_col, infer_schema(json_col)) from json_table
Here is a test case that uses an existing aggregate function, first():
create temporary view schemas(schema) as select * from values ('struct<a:int>'), ('map<string,int>');
select from_json('{"a":1}', first(schema)) from schemas;
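The behaviour described above can be modelled with a small, self-contained Python sketch. It is not Spark code: the classes Literal and AggregateCall and the function schema_from_argument are hypothetical names, introduced only to illustrate why a schema supplied by first(schema) is rejected while a string literal is accepted. The real check lives in JsonToStructs and may differ in detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical constant expression; its value is known before execution."""
    value: str


@dataclass(frozen=True)
class AggregateCall:
    """Hypothetical aggregate such as first(schema); its value exists only after execution."""
    name: str
    column: str


def schema_from_argument(expr):
    """Return the schema string if expr is a string literal, otherwise raise.

    Simplified stand-in for the literal-only check described in this issue.
    """
    if isinstance(expr, Literal) and isinstance(expr.value, str):
        return expr.value
    raise TypeError(f"schema must be a string literal, got {expr!r}")


# A literal schema is accepted.
accepted = schema_from_argument(Literal("struct<a:int>"))

# An aggregate result is rejected, mirroring the first(schema) test case.
try:
    schema_from_argument(AggregateCall("first", "schema"))
    rejected = False
except TypeError:
    rejected = True
```

In this model the aggregate is rejected because its value cannot be read without running the query, which is the restriction this issue asks to lift.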
Issue Links
- is related to SPARK-24642 Add a function which infers schema from a JSON column (Resolved)