Details

Bug

Status: Closed

Major

Resolution: Fixed

1.21.0

Description
Add an option to RelBuilder to prevent it from merging projects. Currently, if you call RelBuilder.project and the input is a Project, it will merge the expressions. This is usually a good idea, but sometimes it creates very complex expressions. In extreme cases, Calcite can run out of memory.
There is an existing method RelBuilder.shouldMergeProject(), but by default it returns true, and to change it you have to subclass RelBuilder, which is not easy to do.
I propose to add a property RelBuilder.Config.mergeBloat, default 0, which would prevent creating a project whose complexity exceeds the combined complexity of the two projects that went into it.
Example 1:
 Input 1: Project(a+b+c+d AS w, b+c+d+e AS x, c+d+e+f AS y, d+e+f+g AS z) (complexity 28), followed by
 Input 2: Project(w*x AS p, x*y AS q, y*z AS r) (complexity 9) creates
 Output: Project((a+b+c+d) * (b+c+d+e) AS p, (b+c+d+e) * (c+d+e+f) AS q, (c+d+e+f) * (d+e+f+g) AS r).
The expression "a+b+c+d" has complexity 7 (4 fields and 3 calls). Input 1 has complexity 28 (4 expressions, each of complexity 7); input 2 has complexity 9 (3 expressions, each with complexity 3). Output has complexity 45 (3 expressions, each with complexity 15 (8 fields and 7 calls)). 45 is greater than 37 (28 + 9), so this merge would not be allowed.
Example 2:
 Input 1: Project(a+b+c+d AS w, b+c+d+e AS x, c+d+e+f AS y, d+e+f+g AS z) (complexity 28), followed by
 Input 2: Project(w*x AS p, x*y AS q) (complexity 6) creates
 Output: Project((a+b+c+d) * (b+c+d+e) AS p, (b+c+d+e) * (c+d+e+f) AS q) (complexity 30).
Output complexity 30 is less than input complexity 34 (28 + 6), and therefore the merge is allowed.
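The arithmetic in the two examples can be reproduced with a small standalone sketch. It does not use Calcite: the class MergeBloatSketch, its Expr type, and every helper name are hypothetical, invented for illustration. Complexity is counted as one per field reference plus one per call, as in the examples. Treating the bloat value as an additive allowance on top of the combined input complexity is an assumption about how the proposed property would be applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; none of these names come from Calcite. */
class MergeBloatSketch {

  /** A field reference (no operands) or a call (operator plus operands). */
  static final class Expr {
    final String name;
    final List<Expr> operands;

    Expr(String name, Expr... operands) {
      this.name = name;
      this.operands = Arrays.asList(operands);
    }
  }

  static Expr field(String name) { return new Expr(name); }

  static Expr call(String op, Expr... operands) { return new Expr(op, operands); }

  /** Builds a left-deep sum such as a+b+c+d. */
  static Expr sum(String... names) {
    Expr e = field(names[0]);
    for (int i = 1; i < names.length; i++) {
      e = call("+", e, field(names[i]));
    }
    return e;
  }

  static Expr product(String left, String right) {
    return call("*", field(left), field(right));
  }

  /** Complexity = number of nodes: one per field and one per call. */
  static int complexity(Expr e) {
    int n = 1;
    for (Expr operand : e.operands) {
      n += complexity(operand);
    }
    return n;
  }

  static int totalComplexity(Iterable<Expr> exprs) {
    int n = 0;
    for (Expr e : exprs) {
      n += complexity(e);
    }
    return n;
  }

  /** Replaces each field reference with the lower project's expression of that name. */
  static Expr inline(Expr e, Map<String, Expr> lower) {
    if (e.operands.isEmpty()) {
      Expr replacement = lower.get(e.name);
      return replacement != null ? replacement : e;
    }
    Expr[] inlined = new Expr[e.operands.size()];
    for (int i = 0; i < inlined.length; i++) {
      inlined[i] = inline(e.operands.get(i), lower);
    }
    return call(e.name, inlined);
  }

  /** Assumption: a merge is allowed when merged <= lower + upper + bloat. */
  static boolean mergeAllowed(Map<String, Expr> lower, List<Expr> upper, int bloat) {
    List<Expr> merged = new ArrayList<>();
    for (Expr e : upper) {
      merged.add(inline(e, lower));
    }
    int inputComplexity = totalComplexity(lower.values()) + totalComplexity(upper);
    return totalComplexity(merged) <= inputComplexity + bloat;
  }

  /** Input 1 from both examples: w, x, y, z. */
  static Map<String, Expr> lowerProject() {
    Map<String, Expr> m = new LinkedHashMap<>();
    m.put("w", sum("a", "b", "c", "d"));
    m.put("x", sum("b", "c", "d", "e"));
    m.put("y", sum("c", "d", "e", "f"));
    m.put("z", sum("d", "e", "f", "g"));
    return m;
  }

  public static void main(String[] args) {
    Map<String, Expr> lower = lowerProject();
    List<Expr> upper1 = Arrays.asList(product("w", "x"), product("x", "y"), product("y", "z"));
    List<Expr> upper2 = Arrays.asList(product("w", "x"), product("x", "y"));
    System.out.println(mergeAllowed(lower, upper1, 0)); // Example 1: 45 > 37, prints false
    System.out.println(mergeAllowed(lower, upper2, 0)); // Example 2: 30 <= 34, prints true
  }
}
```

Under the same additive assumption, a bloat value of 8 would be the smallest that admits the merge in Example 1, since 45 = 37 + 8.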