Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-25419 Parquet predicate pushdown improvement
  3. SPARK-25207

Case-insensitve field resolution for filter pushdown when reading Parquet

    XMLWordPrintableJSON

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.4.0
    • Component/s: SQL
    • Labels:

      Description

      Currently, filter pushdown will not work if Parquet schema and Hive metastore schema are in different letter cases even spark.sql.caseSensitive is false.

      Like the below case:

      spark.range(10).write.parquet("/tmp/data")
      sql("DROP TABLE t")
      sql("CREATE TABLE t (ID LONG) USING parquet LOCATION '/tmp/data'")
      sql("select * from t where id > 0").show

      No filter will be pushed down.

      scala> sql("select * from t where id > 0").explain   // Filters are pushed with `ID`
      == Physical Plan ==
      *(1) Project [ID#90L]
      +- *(1) Filter (isnotnull(id#90L) && (id#90L > 0))
         +- *(1) FileScan parquet default.t[ID#90L] Batched: true, Format: Parquet, Location: InMemoryFileIndex[file:/tmp/data], PartitionFilters: [], PushedFilters: [IsNotNull(ID), GreaterThan(ID,0)], ReadSchema: struct<ID:bigint>
      
      scala> sql("select * from t").show    // Parquet returns NULL for `ID` because it has `id`.
      +----+
      |  ID|
      +----+
      |null|
      |null|
      |null|
      |null|
      |null|
      |null|
      |null|
      |null|
      |null|
      |null|
      +----+
      
      scala> sql("select * from t where id > 0").show   // `NULL > 0` is `false`.
      +---+
      | ID|
      +---+
      +---+
      

        Attachments

        1. image.png
          41 kB
          Dongjoon Hyun

          Issue Links

            Activity

              People

              • Assignee:
                yucai yucai
                Reporter:
                yucai yucai
              • Votes:
                0 Vote for this issue
                Watchers:
                5 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: