  SPARK-20938

explain() for datasources implementing CatalystScan does not show pushed predicates correctly


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: SQL

    Description

      The actual pushed-down Catalyst predicate expressions do not appear to be represented correctly in the explain() output (only the translated sources filters are shown) when a datasource implementing CatalystScan is used. For example, the case below

      df.filter("cast(a as string) == '1' and a < 3").explain()
      

      prints the plan as follows:

      == Physical Plan ==
      *Filter (cast(a#0L as string) = 1)
      +- *Scan SimpleCatalystScan(0,10) [a#0L] PushedFilters: [*LessThan(a,3)], ReadSchema: struct<a:bigint>
      

      The actual predicates passed to buildScan(requiredColumns: Seq[Attribute], filters: Seq[Expression]) are as below:

      println(filters.mkString("[", ", ", "]"))
      
      [(cast(a#0L as string) = 1), (a#0L < 3)]
      
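      For reference, here is a minimal sketch of a relation that should reproduce this, assuming a single LongType column a. It uses the CatalystScan trait from org.apache.spark.sql.sources; the class name SimpleCatalystScan and the DefaultSource provider below are illustrative, not the exact test code:

      import org.apache.spark.rdd.RDD
      import org.apache.spark.sql.{Row, SQLContext}
      import org.apache.spark.sql.catalyst.expressions.{Attribute, Expression}
      import org.apache.spark.sql.sources.{BaseRelation, CatalystScan, RelationProvider}
      import org.apache.spark.sql.types.{LongType, StructField, StructType}

      // Relation over a small range of longs that receives raw Catalyst expressions.
      case class SimpleCatalystScan(start: Long, end: Long)(@transient val sqlContext: SQLContext)
        extends BaseRelation with CatalystScan {

        override def schema: StructType =
          StructType(StructField("a", LongType, nullable = false) :: Nil)

        override def buildScan(requiredColumns: Seq[Attribute], filters: Seq[Expression]): RDD[Row] = {
          // The full Catalyst predicates arrive here, e.g. (cast(a#0L as string) = 1) and (a#0L < 3),
          // but explain() only reports the translated sources filter LessThan(a,3).
          println(filters.mkString("[", ", ", "]"))
          sqlContext.sparkContext.parallelize(start to end).map(Row(_))
        }
      }

      // Provider so the relation can be loaded via spark.read.format(...).load().
      class DefaultSource extends RelationProvider {
        override def createRelation(sqlContext: SQLContext, parameters: Map[String, String]): BaseRelation =
          SimpleCatalystScan(0, 10)(sqlContext)
      }

      Loading this relation with spark.read.format(...) pointed at the package of DefaultSource, then running the filter/explain call above, should show the mismatch between the Catalyst filters printed from buildScan and the PushedFilters entry in the plan.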


          People

            Assignee: Unassigned
            Reporter: Hyukjin Kwon (gurwls223)
            Votes: 0
            Watchers: 1
