Spark SQL has both CASE WHEN and IF expressions.
I've seen many cases where end-users write a two-branch CASE WHEN because Spark doesn't have an org.apache.spark.sql.functions._ method for the If expression.
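As an illustration of that pattern, here is a minimal sketch assuming Spark's Scala DataFrame API. The input data, the column name x, and the object name CaseWhenVsIf are hypothetical. It shows the two-branch when/otherwise form next to the same logic written with the SQL if function through expr():

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, expr, lit, when}

object CaseWhenVsIf {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("case-when-vs-if")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input: a single integer column named "x".
    val df = Seq(-1, 0, 5).toDF("x")

    // What users typically write: a two-branch CASE WHEN via when/otherwise.
    val viaCaseWhen = df.select(
      when(col("x") > 0, lit("pos")).otherwise(lit("non-pos")).as("sign"))

    // The same logic through the SQL `if` function, reached via expr().
    val viaIf = df.select(expr("if(x > 0, 'pos', 'non-pos')").as("sign"))

    // Both formulations should produce identical rows.
    assert(viaCaseWhen.collect().toSeq == viaIf.collect().toSeq)

    spark.stop()
  }
}
```

The expr() form works today but gives up the typed Column API, which is presumably why users reach for when/otherwise instead.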
Unfortunately, CASE WHEN generates substantial code bloat because its codegen is implemented using a do-while loop. In some performance-critical frameworks, I've modified our code to directly construct the Catalyst If expression, but this is toilsome and confusing to end-users.
If a CASE WHEN has only two branches (a single WHEN plus an ELSE), then Spark should automatically rewrite it into a simple IF expression.
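The proposed rewrite could look roughly like the following optimizer rule. This is only a sketch under stated assumptions: the rule name RewriteTwoBranchCaseWhen is hypothetical, it is not code from Spark itself, and it assumes the rule runs on an analyzed plan so that thenValue.dataType is resolved.

```scala
import org.apache.spark.sql.catalyst.expressions.{CaseWhen, If, Literal}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// Hypothetical rule name; a sketch of the proposal, not Spark's implementation.
object RewriteTwoBranchCaseWhen extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan =
    plan.transformAllExpressions {
      // Match a CASE WHEN with exactly one WHEN branch and an optional ELSE.
      case CaseWhen(Seq((condition, thenValue)), elseValue) =>
        // A missing ELSE evaluates to NULL, so substitute a typed NULL literal.
        val otherwise =
          elseValue.getOrElse(Literal.create(null, thenValue.dataType))
        If(condition, thenValue, otherwise)
    }
}
```

The semantics should be preserved because both expressions fall through to the else value when the condition is false or NULL. For experimentation, a rule like this can be registered on a session with spark.experimental.extraOptimizations = Seq(RewriteTwoBranchCaseWhen).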