Details
-
Bug
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
Impala 2.8.0
Description
This increases the severity of a pre-existing problem, where UDFs are always assumed to be deterministics, so UDFs with only constant arguments were cached or constant folded. In most cases in Impala 2.7.0, this had no effect, e.g. both f and g in f() + g() were re-evaluated for each input row.
The below commit added caching of constant arguments to ScalarFnCall expressions (used for UDFs, builtin functions and various operators), so f() and g() would not be re-evaluated for each input row.
commit 10fa472fa6aa036be02748ae54daed1722449c68 Author: Tim Armstrong <tarmstrong@cloudera.com> Date: Wed Oct 26 10:55:23 2016 -0700 IMPALA-4302,IMPALA-2379: constant expr arg fixes
The ideal solution is to provide syntax for UDF declarations that specifies whether it is deterministic. As a short-term workaround we could add a query option that assumes that all UDFs are non-deterministic.