Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-43596

Subquery decorrelation rewriteDomainJoins failure from ConstantFolding to isnull

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 3.4.0
    • 3.5.0
    • SQL
    • None

    Description

      We can get a decorrelation error because of rewrites that run in between DecorrelateInnerQuery and rewriteDomainJoins, that modify the correlation join conditions. In particular, ConstantFolding can transform `innercol <=> null` to `isnull(innercol)` and then rewriteDomainJoins does not recognize this and throws error Unable to rewrite domain join with conditions: ArrayBuffer(isnull(innercol#280)) because the isnull is not an equality, so it isn't usable for rewriting the domain join.

      Can fix by recognizing `isnull(innercol)` as `innercol <=> null` in rewriteDomainJoins.

      This area is also fragile in general and other rewrites that run between the two steps of decorrelation could potentially break their assumptions, so we may want to investigate longer-term follow ups for that.

      Attachments

        Activity

          People

            Unassigned Unassigned
            jchen5 Jack Chen
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: