Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-35316

UnwrapCastInBinaryComparison support In/InSet predicate

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 3.2.0
    • 3.2.0
    • SQL
    • None

    Description

      It will not pushdown filters for In/InSet predicates:

      spark.range(50).selectExpr("cast(id as int) as id").write.mode("overwrite").parquet("/tmp/parquet/t1")
      spark.read.parquet("/tmp/parquet/t1").where("id in (1L, 2L, 4L)").explain
      spark.read.parquet("/tmp/parquet/t1").where("id = 1L or id = 2L or id = 4L").explain
      
      == Physical Plan ==
      *(1) Filter cast(id#5 as bigint) IN (1,2,4)
      +- *(1) ColumnarToRow
         +- FileScan parquet [id#5] Batched: true, DataFilters: [cast(id#5 as bigint) IN (1,2,4)], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/tmp/parquet/t1], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int>
      
      
      == Physical Plan ==
      *(1) Filter (((id#7 = 1) OR (id#7 = 2)) OR (id#7 = 4))
      +- *(1) ColumnarToRow
         +- FileScan parquet [id#7] Batched: true, DataFilters: [(((id#7 = 1) OR (id#7 = 2)) OR (id#7 = 4))], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/tmp/parquet/t1], PartitionFilters: [], PushedFilters: [Or(Or(EqualTo(id,1),EqualTo(id,2)),EqualTo(id,4))], ReadSchema: struct<id:int>
      

      Attachments

        Issue Links

          Activity

            People

              fchen Fu Chen
              yumwang Yuming Wang
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: