Spark / SPARK-25650

Make analyzer rules used in once-policy idempotent


Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 2.3.2
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      Rules like HandleNullInputsForUDF (https://issues.apache.org/jira/browse/SPARK-24891) do not stabilize (they can keep applying new changes to a plan indefinitely) and can cause problems such as SQL cache mismatches.
      Ideally, all rules, whether in a once-policy batch or a fixed-point-policy batch, should stabilize after the specified number of runs. Once-policy should be treated as a performance optimization, i.e., an assumption that the rule stabilizes after a single run, not an assumption that the rule will never be applied more than once. Such once-policy rules should therefore also run correctly under a fixed-point policy.
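
      For illustration only, the following self-contained Scala sketch (a toy expression tree, not Catalyst's actual Rule/LogicalPlan API) contrasts a rule that keeps rewriting the plan on every run with one that stabilizes after a single run:

      ```scala
      // Toy expression tree used only to illustrate the idempotence property.
      sealed trait Expr
      case class Udf(child: Expr) extends Expr
      case class NullCheck(child: Expr) extends Expr
      case class Column(name: String) extends Expr

      // Non-idempotent: every run wraps the UDF input in another NullCheck layer.
      def wrapNaive(e: Expr): Expr = e match {
        case Udf(child) => Udf(NullCheck(child))
        case other      => other
      }

      // Idempotent: a child that is already wrapped is left untouched, so a second
      // run (or a run inside a fixed-point batch) produces the same plan.
      def wrapIdempotent(e: Expr): Expr = e match {
        case already @ Udf(NullCheck(_)) => already
        case Udf(child)                  => Udf(NullCheck(child))
        case other                       => other
      }

      val plan = Udf(Column("a"))
      assert(wrapNaive(wrapNaive(plan)) != wrapNaive(plan))                 // keeps growing
      assert(wrapIdempotent(wrapIdempotent(plan)) == wrapIdempotent(plan))  // stable after one run
      ```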
      There is already a check for fixed-point batches that throws an exception if the maximum number of runs is reached while the plan is still changing. The associated PR adds a similar check for once-policy batches, throwing an exception if the plan changes between the first run and the second run of a once-policy rule.
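
      A minimal sketch of such a check, assuming a hypothetical checkOnceIdempotence helper rather than Spark's actual RuleExecutor implementation:

      ```scala
      // Sketch of the described check (not Spark's RuleExecutor code): after a
      // once-policy rule runs, apply it a second time and fail if the plan changes,
      // mirroring the existing check for fixed-point batches that hit the run limit.
      def checkOnceIdempotence[Plan](ruleName: String, rule: Plan => Plan, plan: Plan): Plan = {
        val afterFirstRun  = rule(plan)
        val afterSecondRun = rule(afterFirstRun)
        if (afterSecondRun != afterFirstRun) {
          throw new RuntimeException(
            s"Once-policy rule $ruleName is not idempotent: the plan changed on a second run")
        }
        afterFirstRun
      }

      // With the toy rules above, the naive rule would fail this check:
      // checkOnceIdempotence("WrapNaive", wrapNaive, Udf(Column("a")))          // throws
      // checkOnceIdempotence("WrapIdempotent", wrapIdempotent, Udf(Column("a"))) // passes
      ```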

      To reproduce this issue, go to https://github.com/apache/spark/pull/22060, apply the changes, and remove the rule in question from the whitelist at https://github.com/apache/spark/pull/22060/files#diff-f70523b948b7af21abddfa3ab7e1d7d6R71.


          People

            Assignee: Unassigned
            Reporter: Wei Xue (maryannxue)
            Votes: 0
            Watchers: 3
