Details
- Improvement
- Status: Resolved
- Major
- Resolution: Won't Fix
- 3.0.0
- None
- None
Description
I propose the following expression rewrite optimizations:
coalesce(x: Boolean, true) -> x or isnull(x)
coalesce(x: Boolean, false) -> x and isnotnull(x)
This pattern appears when translating Dataset filters on Option[Boolean] columns: we might have a typed Dataset filter which looks like
.filter(_.boolCol.getOrElse(DEFAULT_VALUE))
and the most idiomatic, user-friendly translation of this in Catalyst is to use coalesce(). However, the coalesce() form of this expression is not eligible for Parquet / data source filter pushdown.
(We should write out truth-tables to double-check this rewrite's correctness)
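The truth tables mentioned above can be sketched in plain Scala without Spark. This is only an illustrative model, assuming standard SQL three-valued logic: a nullable Boolean is represented as Option[Boolean] with None standing in for NULL, and every name here (SqlBool, or3, and3, isNull, isNotNull, coalesce, rewritesAgree) is hypothetical rather than a Catalyst API.

```scala
object CoalesceRewriteCheck {
  // Hypothetical model of a nullable SQL Boolean: None stands in for NULL.
  type SqlBool = Option[Boolean]

  // Three-valued OR: TRUE if either side is TRUE, FALSE only if both are FALSE, else NULL.
  def or3(a: SqlBool, b: SqlBool): SqlBool = (a, b) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }

  // Three-valued AND: FALSE if either side is FALSE, TRUE only if both are TRUE, else NULL.
  def and3(a: SqlBool, b: SqlBool): SqlBool = (a, b) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }

  // isnull / isnotnull never return NULL themselves.
  def isNull(a: SqlBool): SqlBool = Some(a.isEmpty)
  def isNotNull(a: SqlBool): SqlBool = Some(a.nonEmpty)

  // coalesce with a non-null Boolean default, as in the proposed rewrite.
  def coalesce(a: SqlBool, default: Boolean): SqlBool = a.orElse(Some(default))

  // The three possible values of a nullable Boolean column.
  val inputs: Seq[SqlBool] = Seq(Some(true), Some(false), None)

  // True when both rewrites match coalesce() on every possible input.
  def rewritesAgree: Boolean = inputs.forall { x =>
    coalesce(x, true) == or3(x, isNull(x)) &&
    coalesce(x, false) == and3(x, isNotNull(x))
  }

  def main(args: Array[String]): Unit = {
    inputs.foreach { x =>
      println(s"x=$x  coalesce(x,true)=${coalesce(x, true)}  x or isnull(x)=${or3(x, isNull(x))}")
      println(s"x=$x  coalesce(x,false)=${coalesce(x, false)}  x and isnotnull(x)=${and3(x, isNotNull(x))}")
    }
    println(s"rewrites agree on all inputs: $rewritesAgree")
  }
}
```

Under this model both sides of each rewrite produce the same value for TRUE, FALSE, and NULL inputs, so the equivalence holds for the expression itself and not just when it is used as a filter predicate. It does not check how Catalyst evaluates these expressions in practice.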