  1. Spark
  2. SPARK-12957 Derive and propagate data constrains in logical plan
  3. SPARK-13495

Add Null Filters in the query plan for Filters/Joins based on their data constraints

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: SQL
    • Labels: None

    Description

      We should add an optimizer rule that avoids reading unnecessary NULL values, when they are not required for correctness, by inserting isNotNull filters in the query plan. These filters should be inserted beneath existing Filter and Join operators and are inferred from those operators' data constraints.

      For example, given a filter on a = 10, we know that NULL values of a cannot pass this predicate, so we can add an IsNotNull(a) filter below it.
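
      The inference described above can be sketched with a toy model. This is only an illustration in plain Python, not Spark's Catalyst code: the classes (Attr, Lit, Cmp, And, Or, IsNotNull, Scan, Filter) and the functions not_null_attrs and add_null_filter are hypothetical names invented for this example. It assumes that a comparison evaluates to NULL when either operand is NULL, so a row passes the filter only if every attribute compared directly is non-null.

```python
from dataclasses import dataclass

# Hypothetical mini expression/plan model; none of these are Spark classes.
@dataclass(frozen=True)
class Attr:
    name: str

@dataclass(frozen=True)
class Lit:
    value: object

@dataclass(frozen=True)
class Cmp:            # null-intolerant comparison such as =, <, >
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Or:
    left: object
    right: object

@dataclass(frozen=True)
class IsNotNull:
    attr: Attr

@dataclass(frozen=True)
class Scan:
    table: str

@dataclass(frozen=True)
class Filter:
    condition: object
    child: object

def not_null_attrs(pred):
    """Names of attributes that must be non-null for pred to evaluate to true."""
    if isinstance(pred, Cmp):
        # A NULL operand makes the comparison NULL, which a filter rejects.
        return frozenset(e.name for e in (pred.left, pred.right)
                         if isinstance(e, Attr))
    if isinstance(pred, IsNotNull):
        return frozenset({pred.attr.name})
    if isinstance(pred, And):
        # Both sides must hold, so constraints from either side apply.
        return not_null_attrs(pred.left) | not_null_attrs(pred.right)
    if isinstance(pred, Or):
        # Only constraints implied by both branches are safe to keep.
        return not_null_attrs(pred.left) & not_null_attrs(pred.right)
    return frozenset()

def add_null_filter(plan):
    """Insert a Filter of IsNotNull checks beneath an existing Filter."""
    names = sorted(not_null_attrs(plan.condition))
    if not names:
        return plan
    checks = IsNotNull(Attr(names[0]))
    for name in names[1:]:
        checks = And(checks, IsNotNull(Attr(name)))
    return Filter(plan.condition, Filter(checks, plan.child))

# Filter(a = 10) over a scan of a hypothetical table "t".
plan = Filter(Cmp("=", Attr("a"), Lit(10)), Scan("t"))
rewritten = add_null_filter(plan)
```

      Here rewritten keeps the original a = 10 filter on top and places a Filter on IsNotNull(a) directly above the scan. The same reasoning would apply to the condition of an inner equi-join, where rows with a NULL join key can never match.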

      cc yhuai nongli

          People

            Assignee: Sameer Agarwal (sameerag)
            Reporter: Sameer Agarwal (sameerag)
            Shepherd: Nong Li
            Votes: 0
            Watchers: 4
