SPARK-19638: Filter pushdown not working for struct fields

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:
      None

      Description

      While working with a dataset containing struct fields, with debug logging enabled in the Elasticsearch (ES) connector, I'm seeing the following behavior. The dataframe is created over the ES connector, and the schema is then extended with a couple of column aliases, such as:

      df.withColumn("f2", df("foo"))
      

      Queries against those alias columns work as expected when the aliased field is not a struct member: the filters are pushed down to the connector.

      scala> df.withColumn("f2", df("foo")).where("f2 == '1'").limit(0).show
      17/02/16 15:06:49 DEBUG DataSource: Pushing down filters [IsNotNull(foo),EqualTo(foo,1)]
      17/02/16 15:06:49 TRACE DataSource: Transformed filters into DSL [{"exists":{"field":"foo"}},{"match":{"foo":"1"}}]
      

      However, the same query with an alias over a struct field pushes down no filters.

      scala> df.withColumn("bar_baz", df("bar.baz")).where("bar_baz == '1'").limit(1).show
      

      In fact, this is the case even when no alias is used at all.

      scala> df.where("bar.baz == '1'").limit(1).show
      

      Basically, pushdown for structs doesn't work at all.

      Maybe this is specific to the ES connector?
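
      One possible reading of the behavior above, offered as an assumption rather than a confirmed diagnosis: if the planner only translates predicates of the form "top-level column == literal" into source filters, then a predicate on a nested field such as bar.baz would never reach any connector, ES or otherwise. The sketch below is a toy model of that rule in plain Scala. Every type and function in it (Column, NestedField, PushdownModel, translate, resolveAlias) is hypothetical and written for illustration; none of it is Spark's or the connector's actual code. Only the filter names IsNotNull and EqualTo are borrowed from the log output above.

```scala
// Toy model only: hypothetical stand-ins, not Spark or connector classes.
sealed trait Expr
final case class Column(name: String) extends Expr                      // top-level column, e.g. foo
final case class NestedField(parent: Expr, field: String) extends Expr  // struct access, e.g. bar.baz
final case class Lit(value: String) extends Expr
final case class Equals(left: Expr, right: Expr) extends Expr

// Filters that a (hypothetical) source would be handed.
sealed trait SourceFilter
final case class IsNotNull(column: String) extends SourceFilter
final case class EqualTo(column: String, value: String) extends SourceFilter

object PushdownModel {
  // Assumed rule: only "top-level column == literal" can be translated.
  def translate(predicate: Expr): Seq[SourceFilter] = predicate match {
    case Equals(Column(name), Lit(v)) => Seq(IsNotNull(name), EqualTo(name, v))
    case _                            => Seq.empty // nothing is pushed down
  }

  // An alias added via withColumn resolves back to the expression it wraps.
  def resolveAlias(aliases: Map[String, Expr])(e: Expr): Expr = e match {
    case Column(n) if aliases.contains(n) => aliases(n)
    case Equals(l, r)                     => Equals(resolveAlias(aliases)(l), resolveAlias(aliases)(r))
    case other                            => other
  }
}

object PushdownModelDemo {
  def main(args: Array[String]): Unit = {
    val aliases: Map[String, Expr] = Map(
      "f2"      -> Column("foo"),
      "bar_baz" -> NestedField(Column("bar"), "baz")
    )
    val onAlias  = PushdownModel.resolveAlias(aliases)(Equals(Column("f2"), Lit("1")))
    val onStruct = PushdownModel.resolveAlias(aliases)(Equals(Column("bar_baz"), Lit("1")))

    println(PushdownModel.translate(onAlias))  // two filters on foo
    println(PushdownModel.translate(onStruct)) // empty: nothing to push down
  }
}
```

      In this model the f2 alias resolves to the top-level column foo and yields the same two filters seen in the log, while both the bar_baz alias and a direct bar.baz predicate resolve to a nested-field access and yield nothing. If the assumption holds, the same result would be expected with any data source, which is one way to test whether the problem is specific to the ES connector.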

              People

              • Assignee:
                Unassigned
                Reporter:
                ndimiduk Nick Dimiduk
