Details
Type: Task
Status: Resolved
Priority: Major
Resolution: Fixed
Versions: 3.5.0, 4.0.0, 3.5.1, 3.5.2, 3.5.3
Description
Consider the following query:
```
val result = inputData.toDF()
  .select("_1", "_2")
  .withColumn("timestamp", to_timestamp($"_2", "yyyy-MM-dd HH:mm:ss"))
  .withWatermark("timestamp", "24 hours")
  .dropDuplicatesWithinWatermark("timestamp")
  // The final projection no longer selects the event time column.
  .select("_1")
```
Currently, the `ColumnPruning` optimizer rule prunes the `timestamp` column because the final `Project` does not select it. `DeduplicateWithinWatermarkExec` then throws a `java.util.NoSuchElementException` when it tries to get the event time column.
To fix this, the `references` of the `DeduplicateWithinWatermark` logical plan node need to include the event time column, so that column pruning keeps it.
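The interaction between a node's references and column pruning can be illustrated with a small toy model. This is only a sketch and not Spark's Catalyst code: `Plan`, `Source`, `Project`, `Dedup`, `prune`, and `optimize` are hypothetical names, columns are plain strings, and the plan deduplicates on a hypothetical key `_1` so that the event time column is not already referenced as a key. `eventTime = None` stands in for the current behavior and `eventTime = Some("timestamp")` for the proposed one.

```scala
// Toy model of column pruning. These are NOT Spark's Catalyst classes;
// every name below is hypothetical and exists only for illustration.
object PruningSketch {
  sealed trait Plan {
    def output: Seq[String]     // columns this node produces
    def references: Set[String] // columns this node itself needs from its child
  }

  case class Source(output: Seq[String]) extends Plan {
    val references: Set[String] = Set.empty
  }

  case class Project(columns: Seq[String], child: Plan) extends Plan {
    val output: Seq[String] = columns
    val references: Set[String] = columns.toSet
  }

  // eventTime = None models a node that forgets to reference the event time
  // column; Some(col) models a node that declares it.
  case class Dedup(keys: Seq[String], eventTime: Option[String], child: Plan) extends Plan {
    def output: Seq[String] = child.output
    def references: Set[String] = keys.toSet ++ eventTime.toList
  }

  // Keep only the columns that the parent requires or that the node references.
  def prune(plan: Plan, required: Set[String]): Plan = plan match {
    case p: Project => p.copy(child = prune(p.child, p.references))
    case d: Dedup   => d.copy(child = prune(d.child, required ++ d.references))
    case s: Source  =>
      val kept = s.output.filter(required.contains)
      if (kept == s.output) s else Project(kept, s)
  }

  def optimize(plan: Plan): Plan = prune(plan, plan.output.toSet)

  def main(args: Array[String]): Unit = {
    val source = Source(Seq("_1", "_2", "timestamp"))
    val before = optimize(Project(Seq("_1"), Dedup(Seq("_1"), None, source)))
    val after  = optimize(Project(Seq("_1"), Dedup(Seq("_1"), Some("timestamp"), source)))
    println(before) // the Dedup child no longer outputs "timestamp"
    println(after)  // the Dedup child still outputs "timestamp"
  }
}
```

In this sketch the only difference between the two plans is whether the deduplication node lists the event time column among its references, which is the same lever the ticket proposes to use.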