Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Version: 3.5.2
Description
The `OptimizeCsvJsonExprs` rule can change the schema of the underlying `StructField` when the field name used to access the struct differs from the field name declared in the underlying struct.
This surfaces as a correctness issue: instead of returning the values of the corresponding column, the query returns NULL.
A simple example query is:
SELECT
  from_json('[{"a": '||id||', "b": '|| (2*id) ||'}]', 'array<struct<a: INT, b: INT>>').a,
  from_json('[{"a": '||id||', "b": '|| (2*id) ||'}]', 'array<struct<a: INT, b: INT>>').A
FROM range(3) AS t
Here, the result is `[0], [1], [2]` for `a` but `[null], [null], [null]` for `A`. Since struct field access is case-insensitive, the result should have been `[0], [1], [2]` for both.
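To make the failure mode concrete, here is a toy model in plain Python. It is not Spark's code and does not use any Spark API. It assumes, based only on the description above, that the rule prunes the JSON schema down to the accessed field using the accessor's spelling, and that JSON keys are then matched against schema field names exactly. All function names (`parse_with_schema`, `prune_by_accessor_name`, `prune_by_declared_name`, `get_field`, `make_row`) are hypothetical.

```python
import json

# Field names declared in the query's schema: array<struct<a: INT, b: INT>>.
DECLARED_FIELDS = ["a", "b"]


def parse_with_schema(json_text, field_names):
    """Parse a JSON array of objects, keeping only the given fields.

    Keys are matched by exact name (an assumption of this model), so a
    schema field with no exactly matching key yields None (NULL).
    """
    return [{name: obj.get(name) for name in field_names}
            for obj in json.loads(json_text)]


def prune_by_accessor_name(accessor):
    """Problematic pruning: the pruned schema takes the accessor's spelling."""
    return [accessor]


def prune_by_declared_name(accessor):
    """Pruning that first resolves the accessor, case-insensitively,
    to the name declared in the original schema."""
    matches = [f for f in DECLARED_FIELDS if f.lower() == accessor.lower()]
    if len(matches) != 1:
        raise KeyError(f"cannot resolve field {accessor!r}")
    return matches


def get_field(json_text, accessor, prune):
    """Model of `from_json(...).<accessor>` with schema pruning applied."""
    pruned = prune(accessor)
    rows = parse_with_schema(json_text, pruned)
    # Exactly one field survives pruning; read it from each parsed row.
    return [row[pruned[0]] for row in rows]


def make_row(i):
    """Build the same JSON text as the query does for a given id."""
    return '[{"a": %d, "b": %d}]' % (i, 2 * i)
```

In this model, `get_field(make_row(1), "A", prune_by_accessor_name)` parses with a schema containing only `A`, finds no key with that exact name, and yields `[None]`, which mirrors the `[null]` results reported above. Resolving the accessor to the declared name before pruning, as in `prune_by_declared_name`, keeps the schema field as `a` and returns `[1]`.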