Description
The code that collects ExtractValue expressions is wrong: it checks only two levels of nesting, when it should validate the expression bottom-up. This can produce incorrect results when the expression has the form ExtractValue(ExtractValue(some_other_expr)).
An example that triggers the bug:
input: <col1: array<struct<a: int, b: struct<a: struct<a: int, b: int>, b: int>>>>
Project(ExtractValue(ExtractValue(CaseWhen([col.a == 1, col.b]), "a"), "a"))
- Generate(Explode(col1))
We incorrectly try to push the whole expression down into the input of the Explode. The CaseWhen then receives array<...> as its input rather than the exploded struct, so we get a wrong result.
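
The difference between the two-level check and a bottom-up check can be illustrated with a small toy model. This is only a sketch, not the actual Spark code: the classes Attr, Lit, GetField and Other and the functions collect_two_level, is_extract_chain and collect_bottom_up are hypothetical names invented for illustration, with GetField standing in for ExtractValue and Other standing in for any other expression such as CaseWhen.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Toy expression tree. All names here are hypothetical, not Spark's API.
@dataclass(frozen=True)
class Expr:
    def children(self) -> Tuple["Expr", ...]:
        return ()


@dataclass(frozen=True)
class Attr(Expr):
    name: str


@dataclass(frozen=True)
class Lit(Expr):
    value: int


@dataclass(frozen=True)
class GetField(Expr):
    """Stand-in for an ExtractValue expression."""
    child: Expr
    field: str

    def children(self) -> Tuple[Expr, ...]:
        return (self.child,)


@dataclass(frozen=True)
class Other(Expr):
    """Any non-extraction expression, e.g. CaseWhen or an equality test."""
    op: str
    args: Tuple[Expr, ...]

    def children(self) -> Tuple[Expr, ...]:
        return self.args


def collect_two_level(e: Expr) -> List[Expr]:
    # Looks only at the node and its direct child, so a GetField over a
    # GetField is accepted no matter what the inner GetField wraps.
    if isinstance(e, Attr):
        return [e]
    if isinstance(e, GetField) and isinstance(e.child, (GetField, Attr)):
        return [e]
    return [x for c in e.children() for x in collect_two_level(c)]


def is_extract_chain(e: Expr) -> bool:
    # True only if every level down to the root is a GetField and the
    # root itself is an attribute.
    if isinstance(e, Attr):
        return True
    if isinstance(e, GetField):
        return is_extract_chain(e.child)
    return False


def collect_bottom_up(e: Expr) -> List[Expr]:
    # Accept a node only when the whole chain beneath it is valid;
    # otherwise keep searching inside its children.
    if is_extract_chain(e):
        return [e]
    return [x for c in e.children() for x in collect_bottom_up(c)]


# The expression from the example above, modeled in the toy tree.
col = Attr("col")
case_when = Other(
    "CaseWhen",
    (Other("==", (GetField(col, "a"), Lit(1))), GetField(col, "b")),
)
expr = GetField(GetField(case_when, "a"), "a")

two_level = collect_two_level(expr)
bottom_up = collect_bottom_up(expr)
```

In this model, two_level contains the whole expression, CaseWhen included, which corresponds to the incorrect pushdown. bottom_up contains only col.a and col.b, the extractions that are actually rooted at the attribute.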