Details
Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Version: 4.0.0
Description
I propose adding a Variant data type to Spark. It efficiently represents semi-structured values without a user-specified schema. Currently, many users rely on JSON expressions to handle JSON data, which often leads to repeated JSON parsing and degraded performance. A major goal of the Variant type is to use a more efficient binary representation internally and so avoid repeated JSON parsing, while keeping the flexibility of schemaless JSON data.
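To make the repeated-parsing cost concrete, here is a minimal Python sketch using only the standard library. It is not Spark code and does not implement the Variant binary encoding; a parsed `dict` merely stands in for "a value that was parsed once and stored in an internal form". The helper names (`counting_loads`, `get_json_field`, `get_parsed_field`) and the sample row are hypothetical, introduced only for illustration.

```python
import json

parse_calls = 0  # counts how many times the JSON text is fully parsed


def counting_loads(text):
    """Parse JSON text and record that a full parse happened."""
    global parse_calls
    parse_calls += 1
    return json.loads(text)


def get_json_field(json_text, key):
    """Stand-in for a JSON-string expression: re-parses the text on every call."""
    return counting_loads(json_text).get(key)


def get_parsed_field(parsed, key):
    """Stand-in for a field lookup on an already-parsed value: no parsing."""
    return parsed.get(key)


row = '{"id": 7, "user": "ada", "tags": ["x", "y"]}'
keys = ("id", "user", "tags")

# Approach 1: keep the column as a JSON string and extract each field separately.
parse_calls = 0
repeated = [get_json_field(row, k) for k in keys]
repeated_parses = parse_calls  # one full parse per extracted field

# Approach 2: parse once up front, then look up fields in the parsed value.
parse_calls = 0
parsed_once = counting_loads(row)
reused = [get_parsed_field(parsed_once, k) for k in keys]
single_parses = parse_calls  # a single parse, however many fields are read

print(repeated == reused, repeated_parses, single_parses)
```

Both approaches return the same field values, but the first parses the document once per extracted field while the second parses it once in total. The proposal applies the same idea at the storage level, without requiring a schema up front.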
Issue Links
1. Add Golden Table Tests for Variant from different engines | Open | Unassigned
2. Functions to shred a Variant into components | Open | Unassigned
3. Add support for interval types in the Variant spec | Open | Unassigned
4. Add variant metrics to JSON Scan nodes and Project nodes containing variant-constructor expressions | Open | Unassigned