Spark / SPARK-28375

Enforce idempotence on the PullupCorrelatedPredicates optimizer rule

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0, 2.3.0, 2.4.0, 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: SQL

    Description

      The current PullupCorrelatedPredicates implementation is not idempotent: running the rule more than once can accidentally remove predicates.

      For example, given the optimized logical plan below, one more optimizer run removes the predicate (b#1 < d#3) from the SubqueryExpression.

      # Optimized
      Project [a#0]
      +- Filter a#0 IN (list#4 [(b#1 < d#3)])
         :  +- Project [c#2, d#3]
         :     +- LocalRelation <empty>, [c#2, d#3]
         +- LocalRelation <empty>, [a#0, b#1]
      
      # Double optimized
      Project [a#0]
      +- Filter a#0 IN (list#4 [])
         :  +- Project [c#2, d#3]
         :     +- LocalRelation <empty>, [c#2, d#3]
         +- LocalRelation <empty>, [a#0, b#1]
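
The property at stake can be sketched with a toy model. This is not Spark's code: `ToySubquery`, `pull_up_naive`, `pull_up_idempotent`, and `is_idempotent` are hypothetical names, and the failure mechanism shown (overwriting already pulled-up predicates with the newly found ones) is an assumption chosen only to reproduce the behavior in the plans above.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Columns of the outer relation in the example plan.
OUTER_COLUMNS = ("a#0", "b#1")


@dataclass(frozen=True)
class ToySubquery:
    # Predicates still inside the subquery plan.
    inner_predicates: Tuple[str, ...]
    # Predicates already pulled up onto the subquery expression.
    pulled_up: Tuple[str, ...] = ()


def is_correlated(pred: str) -> bool:
    # A predicate is correlated if it mentions an outer column.
    return any(col in pred for col in OUTER_COLUMNS)


def split(sq: ToySubquery):
    found = tuple(p for p in sq.inner_predicates if is_correlated(p))
    rest = tuple(p for p in sq.inner_predicates if not is_correlated(p))
    return found, rest


def pull_up_naive(sq: ToySubquery) -> ToySubquery:
    # Assumed bug: overwrites earlier pulled-up predicates with the new ones.
    found, rest = split(sq)
    return ToySubquery(inner_predicates=rest, pulled_up=found)


def pull_up_idempotent(sq: ToySubquery) -> ToySubquery:
    # Keeps earlier pulled-up predicates and appends any newly found ones.
    found, rest = split(sq)
    return ToySubquery(inner_predicates=rest, pulled_up=sq.pulled_up + found)


def is_idempotent(rule: Callable[[ToySubquery], ToySubquery],
                  plan: ToySubquery) -> bool:
    # A rule is idempotent on `plan` if a second run changes nothing.
    once = rule(plan)
    return rule(once) == once


start = ToySubquery(inner_predicates=("b#1 < d#3",))
naive_idempotent = is_idempotent(pull_up_naive, start)
fixed_idempotent = is_idempotent(pull_up_idempotent, start)
```

Under these assumptions, the naive rule loses the predicate on its second run, which mirrors the "Double optimized" plan, while the second variant returns the same result on every further run.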
      

       

       

            People

              Assignee: dkbiswal Dilip Biswal
              Reporter: manifoldQAQ Yesheng Ma
