Spark / SPARK-47120

Null comparison in pushed-down data filter from subquery produces NPE in Parquet filter


Details

    Description

      This issue was introduced in https://github.com/apache/spark/pull/41088, where scalar subqueries are converted to literals and the literals are then converted to org.apache.spark.sql.sources.Filter objects. These filters are then pushed down to Parquet.

      If the subquery evaluates to null, the resulting filter compares the column with a null literal, and the Parquet filter conversion code throws a NullPointerException (NPE).
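The failure mode can be illustrated with a small, self-contained Python model. This is only a sketch under stated assumptions, not Spark's actual code: `GreaterThan`, `date_to_days`, `convert_unguarded` and `convert_guarded` are hypothetical names, and Python's `TypeError` stands in for the JVM's NPE. The guard shown is one possible way to avoid the crash and is not necessarily what the fix PR does.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional

EPOCH = date(1970, 1, 1)


@dataclass(frozen=True)
class GreaterThan:
    """Hypothetical stand-in for a pushed-down source filter: attribute > value."""
    attribute: str
    value: Optional[date]  # None models a scalar subquery that returned NULL


def date_to_days(d: date) -> int:
    # Parquet stores DATE values as days since the Unix epoch.
    return (d - EPOCH).days


def convert_unguarded(f: GreaterThan) -> Callable[[int], bool]:
    # Uses the literal without a null check, so a None value raises here.
    bound = date_to_days(f.value)
    return lambda stored_days: stored_days > bound


def convert_guarded(f: GreaterThan) -> Optional[Callable[[int], bool]]:
    # Returning None means "do not push this filter down"; the engine
    # would then evaluate the predicate itself after reading the rows.
    if f.value is None:
        return None
    bound = date_to_days(f.value)
    return lambda stored_days: stored_days > bound
```

With `GreaterThan("d", None)`, `convert_unguarded` raises while `convert_guarded` simply declines to build a predicate; for a non-null literal both produce the same comparison.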

       

      Repro code that results in an NPE:

      create table t1(d date) using parquet;
      create table t2(d date) using parquet;
      insert into t1 values (date'2021-01-01');
      insert into t2 values (null);
      select * from t1 where 1=1 and d > (select d from t2);
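For reference, under SQL three-valued logic the comparison `d > NULL` evaluates to NULL, so the query is expected to return zero rows rather than fail. The sketch below checks that expectation with Python's standard-library `sqlite3` module. It is an assumption-laden stand-in, not a Spark run: SQLite has no DATE type or `using parquet` clause, so the dates are stored as text, and `run_repro` is a hypothetical helper name.

```python
import sqlite3


def run_repro() -> list:
    # In-memory SQLite database standing in for the Parquet-backed tables.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t1(d text)")
        conn.execute("create table t2(d text)")
        conn.execute("insert into t1 values ('2021-01-01')")
        conn.execute("insert into t2 values (null)")
        # The scalar subquery yields NULL, so the predicate is NULL for every row.
        cursor = conn.execute(
            "select * from t1 where 1=1 and d > (select d from t2)"
        )
        return cursor.fetchall()
    finally:
        conn.close()


rows = run_repro()
print(rows)  # prints []
```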

      fix PR 

            People

              cosu Cosmin Dumitru