  1. Spark
  2. SPARK-23802

PropagateEmptyRelation can leave query plan in unresolved state

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.1, 2.4.0
    • Component/s: SQL
    • Labels:
      None

      Description

      Since https://github.com/apache/spark/pull/19825 taught PropagateEmptyRelation to handle more cases, the rule can leave the optimized query plan unresolved.

      A simple repro is to run the following plan through the optimizer:

      LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
        .join(LocalRelation('a.int, 'b.int), UsingJoin(FullOuter, "a" :: Nil), None)

      This results in the following optimized plan:

      Project [coalesce(a#0, null) AS a#7, null AS b#6]
      +- LocalRelation [a#0]

      This plan then fails the type check on the coalesce expression because `a` (an int) and the null literal have different data types.

       

      A simple, targeted fix is to change PropagateEmptyRelation to add casts around the nulls it introduces. A more comprehensive fix would be to run type coercion at the end of optimization so that it can repair cases like this one. Alternatively, the type-checking code could treat NullType as compatible with any other type so that the check does not fail in the first place.
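
      The difference between the options can be illustrated with a small toy model. This is only a sketch: the types below (DataType, Attr, NullLit, Cast, Coalesce, CastNullSketch) are hypothetical stand-ins written for this example and are not Spark's Catalyst classes, and the check is a simplification of what a strict same-type check might look like.

```scala
// Toy model only: these are NOT Spark's Catalyst classes.
sealed trait DataType
case object IntType extends DataType
case object NullType extends DataType

sealed trait Expr { def dataType: DataType }
case class Attr(name: String, dataType: DataType) extends Expr
case class NullLit() extends Expr { val dataType: DataType = NullType }
case class Cast(child: Expr, dataType: DataType) extends Expr

case class Coalesce(children: Seq[Expr]) extends Expr {
  // Result type is taken from the first child (enough for this sketch).
  def dataType: DataType = children.head.dataType

  // Strict check: every child must have exactly the same type.
  def typeChecks: Boolean =
    children.map(_.dataType).distinct.size == 1

  // Lenient alternative: ignore NullType children when comparing types.
  def typeChecksIgnoringNull: Boolean =
    children.map(_.dataType).filterNot(_ == NullType).distinct.size <= 1
}

object CastNullSketch {
  val a: Attr = Attr("a", IntType)

  // Shape of the expression in the plan above: coalesce(a, null).
  val unfixed: Coalesce = Coalesce(Seq(a, NullLit()))

  // Targeted fix: cast the null to the type of the attribute it replaces.
  val fixed: Coalesce = Coalesce(Seq(a, Cast(NullLit(), a.dataType)))

  def main(args: Array[String]): Unit = {
    println(s"unfixed, strict check:  ${unfixed.typeChecks}")             // false
    println(s"fixed, strict check:    ${fixed.typeChecks}")               // true
    println(s"unfixed, lenient check: ${unfixed.typeChecksIgnoringNull}") // true
  }
}
```

      In this model the cast makes the strict check pass without touching the checker, which is why it is the more targeted of the options; the lenient check changes behaviour for every expression that uses it.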

            People

            • Assignee: Robert Kruszewski (robert3005)
            • Reporter: Robert Kruszewski (robert3005)