Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-21652

Optimizer cannot reach a fixed point on certain queries

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 2.2.0
    • 2.3.0
    • Optimizer, SQL
    • None

    Description

      The optimizer cannot reach a fixed point on the following query:

      Seq((1, 2)).toDF("col1", "col2").write.saveAsTable("t1")
      Seq(1, 2).toDF("col").write.saveAsTable("t2")
      spark.sql("SELECT * FROM t1, t2 WHERE t1.col1 = 1 AND 1 = t1.col2 AND t1.col1 = t2.col AND t1.col2 = t2.col").explain(true)
      

      At some point during the optimization, InferFiltersFromConstraints infers a new constraint '(col2#33 = col1#32)' that is appended to the join condition, then PushPredicateThroughJoin pushes it down, ConstantPropagation replaces '(col2#33 = col1#32)' with '1 = 1' based on other propagated constraints, ConstantFolding replaces '1 = 1' with 'true and BooleanSimplification finally removes this predicate. However, InferFiltersFromConstraints will again infer '(col2#33 = col1#32)' on the next iteration and the process will continue until the limit of iterations is reached.

      See below for more details

      === Applying Rule org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints ===
      !Join Inner, ((col1#32 = col#34) && (col2#33 = col#34))                                       Join Inner, ((col2#33 = col1#32) && ((col1#32 = col#34) && (col2#33 = col#34)))
       :- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33)))   :- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33)))
       :  +- Relation[col1#32,col2#33] parquet                                                      :  +- Relation[col1#32,col2#33] parquet
       +- Filter ((1 = col#34) && isnotnull(col#34))                                                +- Filter ((1 = col#34) && isnotnull(col#34))
          +- Relation[col#34] parquet                                                                  +- Relation[col#34] parquet
                      
      
      === Applying Rule org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughJoin ===
      !Join Inner, ((col2#33 = col1#32) && ((col1#32 = col#34) && (col2#33 = col#34)))              Join Inner, ((col1#32 = col#34) && (col2#33 = col#34))
      !:- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33)))   :- Filter (col2#33 = col1#32)
      !:  +- Relation[col1#32,col2#33] parquet                                                      :  +- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33)))
      !+- Filter ((1 = col#34) && isnotnull(col#34))                                                :     +- Relation[col1#32,col2#33] parquet
      !   +- Relation[col#34] parquet                                                               +- Filter ((1 = col#34) && isnotnull(col#34))
      !                                                                                                +- Relation[col#34] parquet
                      
      
      === Applying Rule org.apache.spark.sql.catalyst.optimizer.CombineFilters ===
       Join Inner, ((col1#32 = col#34) && (col2#33 = col#34))                                          Join Inner, ((col1#32 = col#34) && (col2#33 = col#34))
      !:- Filter (col2#33 = col1#32)                                                                   :- Filter (((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33))) && (col2#33 = col1#32))
      !:  +- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33)))   :  +- Relation[col1#32,col2#33] parquet
      !:     +- Relation[col1#32,col2#33] parquet                                                      +- Filter ((1 = col#34) && isnotnull(col#34))
      !+- Filter ((1 = col#34) && isnotnull(col#34))                                                      +- Relation[col#34] parquet
      !   +- Relation[col#34] parquet                                                                  
                      
      
      === Applying Rule org.apache.spark.sql.catalyst.optimizer.ConstantPropagation ===
       Join Inner, ((col1#32 = col#34) && (col2#33 = col#34))                                                                Join Inner, ((col1#32 = col#34) && (col2#33 = col#34))
      !:- Filter (((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33))) && (col2#33 = col1#32))   :- Filter (((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33))) && (1 = 1))
       :  +- Relation[col1#32,col2#33] parquet                                                                               :  +- Relation[col1#32,col2#33] parquet
       +- Filter ((1 = col#34) && isnotnull(col#34))                                                                         +- Filter ((1 = col#34) && isnotnull(col#34))
          +- Relation[col#34] parquet                                                                                           +- Relation[col#34] parquet
                      
      
      === Applying Rule org.apache.spark.sql.catalyst.optimizer.ConstantFolding ===
       Join Inner, ((col1#32 = col#34) && (col2#33 = col#34))                                                    Join Inner, ((col1#32 = col#34) && (col2#33 = col#34))
      !:- Filter (((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33))) && (1 = 1))   :- Filter (((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33))) && true)
       :  +- Relation[col1#32,col2#33] parquet                                                                   :  +- Relation[col1#32,col2#33] parquet
       +- Filter ((1 = col#34) && isnotnull(col#34))                                                             +- Filter ((1 = col#34) && isnotnull(col#34))
          +- Relation[col#34] parquet                                                                               +- Relation[col#34] parquet
                      
      
      === Applying Rule org.apache.spark.sql.catalyst.optimizer.BooleanSimplification ===
       Join Inner, ((col1#32 = col#34) && (col2#33 = col#34))                                                 Join Inner, ((col1#32 = col#34) && (col2#33 = col#34))
      !:- Filter (((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33))) && true)   :- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33)))
       :  +- Relation[col1#32,col2#33] parquet                                                                :  +- Relation[col1#32,col2#33] parquet
       +- Filter ((1 = col#34) && isnotnull(col#34))                                                          +- Filter ((1 = col#34) && isnotnull(col#34))
          +- Relation[col#34] parquet                                                                            +- Relation[col#34] parquet
                      
      
      === Applying Rule org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints ===
      !Join Inner, ((col1#32 = col#34) && (col2#33 = col#34))                                       Join Inner, ((col2#33 = col1#32) && ((col1#32 = col#34) && (col2#33 = col#34)))
       :- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33)))   :- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33)))
       :  +- Relation[col1#32,col2#33] parquet                                                      :  +- Relation[col1#32,col2#33] parquet
       +- Filter ((1 = col#34) && isnotnull(col#34))                                                +- Filter ((1 = col#34) && isnotnull(col#34))
          +- Relation[col#34] parquet                                                                  +- Relation[col#34] parquet
      

      Attachments

        Issue Links

          Activity

            People

              smilegator Xiao Li
              aokolnychyi Anton Okolnychyi
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: